61 datasets found
  1. Data from: Current and projected research data storage needs of Agricultural...

    • catalog.data.gov
    • agdatacommons.nal.usda.gov
    • +2more
    Updated Apr 21, 2025
    + more versions
    Cite
    Agricultural Research Service (2025). Current and projected research data storage needs of Agricultural Research Service researchers in 2016 [Dataset]. https://catalog.data.gov/dataset/current-and-projected-research-data-storage-needs-of-agricultural-research-service-researc-f33da
    Dataset updated
    Apr 21, 2025
    Dataset provided by
    Agricultural Research Servicehttps://www.ars.usda.gov/
    Description

The USDA Agricultural Research Service (ARS) recently established SCINet, which consists of a shared high-performance computing resource, Ceres, and the dedicated high-speed Internet2 network used to access Ceres. Current and potential SCINet users are using and generating very large datasets, so SCINet needs to be provisioned with adequate data storage for their active computing. It is not designed to hold data beyond active research phases. At the same time, the National Agricultural Library has been developing the Ag Data Commons, a research data catalog and repository designed for public data release and professional data curation. Ag Data Commons needs to anticipate the size and nature of data it will be tasked with handling.

    The ARS Web-enabled Databases Working Group, organized under the SCINet initiative, conducted a study to establish baseline data storage needs and practices, and to make projections that could inform future infrastructure design, purchases, and policies. The working group helped develop the survey that is the basis for an internal report. While the report was for internal use, the survey and resulting data may be generally useful and are being released publicly.

    From October 24 to November 8, 2016 we administered a 17-question survey (Appendix A) by emailing a Survey Monkey link to all ARS Research Leaders, intending to cover the data storage needs of all 1,675 SY (Category 1 and Category 4) scientists. We designed the survey to accommodate either individual researcher responses or group responses. Research Leaders could decide, based on their unit's practices or their management preferences, whether to delegate the response to a data management expert in their unit, to ask all members of their unit to respond individually, or to collate their unit's responses themselves before reporting in the survey.

    Larger storage ranges cover vastly different amounts of data, so the implications could be significant depending on whether the true amount is at the lower or higher end of the range. We therefore requested more detail from "Big Data users," the 47 respondents who indicated a total current data volume of more than 10 TB and up to 100 TB, or over 100 TB (Q5); all other respondents are called "Small Data users." Because not all of these follow-up requests were successful, we used the actual follow-up responses to estimate likely responses for those who did not respond. We defined active data as data that would be used within the next six months; all other data are considered inactive, or archival. To calculate per-person storage needs we used the high end of the reported range divided by 1 for an individual response, or by G, the number of individuals in a group response. For Big Data users we used the actual reported values or estimated likely values.

    Resources in this dataset:

    Resource Title: Appendix A: ARS data storage survey questions. File Name: Appendix A.pdf. Resource Description: The full list of questions asked, with the possible responses. The survey was not administered using this PDF; the PDF was generated directly from the administered survey using the Print option under Design Survey. Asterisked questions were required. A list of Research Units and their associated codes was provided in a drop-down not shown here. Resource Software Recommended: Adobe Acrobat, url: https://get.adobe.com/reader/

    Resource Title: CSV of Responses from ARS Researcher Data Storage Survey. File Name: Machine-readable survey response data.csv. Resource Description: CSV file of raw responses from the administered survey, as downloaded unfiltered from Survey Monkey, including incomplete responses, plus additional classification and calculations to support analysis. Individual email addresses and IP addresses have been removed. This is the same data as in the Excel spreadsheet (also provided).

    Resource Title: Responses from ARS Researcher Data Storage Survey. File Name: Data Storage Survey Data for public release.xlsx. Resource Description: MS Excel worksheet of raw responses from the administered survey, as downloaded unfiltered from Survey Monkey, including incomplete responses, plus additional classification and calculations to support analysis. Individual email addresses and IP addresses have been removed. Resource Software Recommended: Microsoft Excel, url: https://products.office.com/en-us/excel
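    The per-person calculation described above is simple arithmetic over the released CSV. A minimal sketch in Python, using hypothetical column names (range_high_tb, group_size) since the actual headers should be checked against the survey questions in Appendix A:

    ```python
    import pandas as pd

    # Hypothetical column names; the released CSV's actual headers differ and
    # should be checked against the survey questions in Appendix A.
    df = pd.read_csv("Machine-readable survey response data.csv")

    # High end of the reported storage range (TB) divided by 1 for an individual
    # response, or by G, the number of individuals covered by a group response.
    df["per_person_tb"] = df["range_high_tb"] / df["group_size"].clip(lower=1)
    print(df["per_person_tb"].describe())
    ```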

  2. Data from: Excel Templates: A Helpful Tool for Teaching Statistics

    • tandf.figshare.com
    zip
    Updated May 30, 2023
    Cite
    Alejandro Quintela-del-Río; Mario Francisco-Fernández (2023). Excel Templates: A Helpful Tool for Teaching Statistics [Dataset]. http://doi.org/10.6084/m9.figshare.3408052.v2
    Explore at:
    zip
    Dataset updated
    May 30, 2023
    Dataset provided by
    Taylor & Francis
    Authors
    Alejandro Quintela-del-Río; Mario Francisco-Fernández
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This article describes a free, open-source collection of templates for the popular Excel (2013 and later versions) spreadsheet program. These templates are spreadsheet files that allow easy, intuitive learning and the implementation of practical examples concerning descriptive statistics, random variables, confidence intervals, and hypothesis testing. Although they are designed to be used with Excel, they can also be employed with other free spreadsheet programs (changing some particular formulas). Moreover, we exploit some possibilities of the ActiveX controls of the Excel Developer Menu to build interactive Gaussian density charts. Finally, it is important to note that they can often be embedded in a web page, so Excel itself is not necessary to use them. These templates have been designed as a useful tool for teaching basic statistics and for carrying out data analysis even when the students are not familiar with Excel. Additionally, they can be used as a complement to other analytical software packages. They aim to assist students in learning statistics within an intuitive working environment. Supplementary materials with the Excel templates are available online.

  3. Data from: Delta Produce Sources Study

    • catalog.data.gov
    • agdatacommons.nal.usda.gov
    Updated Apr 21, 2025
    Cite
    Agricultural Research Service (2025). Delta Produce Sources Study [Dataset]. https://catalog.data.gov/dataset/delta-produce-sources-study-51a7a
    Dataset updated
    Apr 21, 2025
    Dataset provided by
    Agricultural Research Servicehttps://www.ars.usda.gov/
    Description

    The Delta Produce Sources Study was an observational study designed to measure and compare the food environments of farmers markets (n=3) and grocery stores (n=12) in 5 rural towns located in the Lower Mississippi Delta region of Mississippi. Data were collected via electronic surveys from June 2019 to March 2020 using a modified version of the Nutrition Environment Measures Survey (NEMS) Farmers Market Audit tool. The tool was modified to collect information on the source of fresh produce and for use with both farmers markets and grocery stores. Availability, source, quality, and price information were collected and compared between farmers markets and grocery stores for 13 fresh fruits and 32 fresh vegetables via SAS software programming. Because the towns were not randomly selected and the sample sizes are relatively small, the data may not be generalizable to all rural towns in the Lower Mississippi Delta region of Mississippi.

    Resources in this dataset:

    Resource Title: Delta Produce Sources Study dataset. File Name: DPS Data Public.csv. Resource Description: The dataset contains variables corresponding to availability, source (country, state and town if the country is the United States), quality, and price (by weight or volume) of 13 fresh fruits and 32 fresh vegetables sold in farmers markets and grocery stores located in 5 Lower Mississippi Delta towns. Resource Software Recommended: Microsoft Excel, url: https://www.microsoft.com/en-us/microsoft-365/excel

    Resource Title: Delta Produce Sources Study data dictionary. File Name: DPS Data Dictionary Public.csv. Resource Description: This file is the data dictionary corresponding to the Delta Produce Sources Study dataset. Resource Software Recommended: Microsoft Excel, url: https://www.microsoft.com/en-us/microsoft-365/excel

  4. Superstore Sales

    • kaggle.com
    Updated Jul 7, 2023
    Cite
    gillh3877 (2023). Superstore Sales [Dataset]. https://www.kaggle.com/datasets/gillh3877/superstore-sales
    Explore at: Croissant
    Dataset updated
    Jul 7, 2023
    Dataset provided by
    Kagglehttp://kaggle.com/
    Authors
    gillh3877
    Description

    This dataset contains Superstore sales for the last three months at three different locations: A, B, and C. The project motivation was to create a visual dashboard that helps a business manager find the weak areas each location can work on to become more profitable. As a first step, I cleaned the dataset to work on it and changed the date column to text. I then created 8 pivot tables with graphs and, finally, an Excel dashboard. Thanks!

  5. 18 excel spreadsheets by species and year giving reproduction and growth...

    • catalog.data.gov
    • data.wu.ac.at
    Updated Aug 17, 2024
    + more versions
    Cite
    U.S. EPA Office of Research and Development (ORD) (2024). 18 excel spreadsheets by species and year giving reproduction and growth data. One excel spreadsheet of herbicide treatment chemistry. [Dataset]. https://catalog.data.gov/dataset/18-excel-spreadsheets-by-species-and-year-giving-reproduction-and-growth-data-one-excel-sp
    Dataset updated
    Aug 17, 2024
    Dataset provided by
    United States Environmental Protection Agencyhttp://www.epa.gov/
    Description

    Excel spreadsheets by species (the 4-letter code is an abbreviation for the genus and species used in the study; the year, 2010 or 2011, is the year the data were collected; SH indicates data for Science Hub; the date is the date of file preparation). The data in each file are described in a read me file, which is the first worksheet in each file. Each row in a species spreadsheet is for one plot (plant). The data themselves are in the data worksheet. One file includes a read me description of the columns in the data set for chemical analysis; in this file, one row is an herbicide treatment and sample for chemical analysis (if taken). This dataset is associated with the following publication: Olszyk, D., T. Pfleeger, T. Shiroyama, M. Blakely-Smith, E. Lee, and M. Plocher. Plant reproduction is altered by simulated herbicide drift to constructed plant communities. Environmental Toxicology and Chemistry (Society of Environmental Toxicology and Chemistry, Pensacola, FL, USA), 36(10): 2799-2813, 2017.
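    Since each workbook keeps its documentation in a leading read me worksheet and its observations in a separate data worksheet, one might read a species file as sketched below (the file name is illustrative and the worksheet name "data" is an assumption; check each file's own read me):

    ```python
    import pandas as pd

    # Illustrative file name; worksheet layout is documented in each file's
    # first ("read me") worksheet.
    path = "ABCD_2010_SH_2017-01-01.xlsx"
    readme = pd.read_excel(path, sheet_name=0, header=None)  # column descriptions
    data = pd.read_excel(path, sheet_name="data")            # one row per plot (plant)

    print(readme.head())
    print(data.shape)
    ```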

  6. Dairy Supply Chain Sales Dataset

    • data.niaid.nih.gov
    • zenodo.org
    Updated Jul 12, 2024
    + more versions
    Cite
    Athanasios Liatifis (2024). Dairy Supply Chain Sales Dataset [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_7853252
    Dataset updated
    Jul 12, 2024
    Dataset provided by
    Dimitrios Pliatsios
    Christos Chaschatzis
    Athanasios Liatifis
    Dimitris Iatropoulos
    Vasileios Argyriou
    Thomas Lagkas
    Panagiotis Sarigiannidis
    Konstantinos Georgakidis
    Anna Triantafyllou
    Ilias Siniosoglou
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    1. Introduction

    Sales data collection is a crucial aspect of any manufacturing industry as it provides valuable insights about the performance of products, customer behaviour, and market trends. By gathering and analysing this data, manufacturers can make informed decisions about product development, pricing, and marketing strategies in Internet of Things (IoT) business environments like the dairy supply chain.

    One of the most important benefits of the sales data collection process is that it allows manufacturers to identify their most successful products and target their efforts towards those areas. For example, if a manufacturer notices that a particular product is selling well in a certain region, this information can be used to develop new products, optimise the supply chain, or improve existing products to meet the changing needs of customers.

    This dataset includes information about 7 of MEVGAL's products [1]. The published data will help researchers understand the dynamics of the dairy market and its consumption patterns, creating fertile ground for synergies between academia and industry and eventually helping the industry make informed decisions regarding product development, pricing and market strategies in the IoT playground. The dataset can also be used to study the impact of external factors on the dairy market, such as economic, environmental, and technological factors, and to understand the current state of the dairy industry and identify potential opportunities for growth and development.

    2. Citation

    Please cite the following papers when using this dataset:

    I. Siniosoglou, K. Xouveroudis, V. Argyriou, T. Lagkas, S. K. Goudos, K. E. Psannis and P. Sarigiannidis, "Evaluating the Effect of Volatile Federated Timeseries on Modern DNNs: Attention over Long/Short Memory," in the 12th International Conference on Circuits and Systems Technologies (MOCAST 2023), April 2023, Accepted

    3. Dataset Modalities

    The dataset includes data regarding the daily sales of a series of dairy product codes offered by MEVGAL. In particular, the dataset includes information gathered by the logistics division and agencies within the industrial infrastructures overseeing the production of each product code. The products included in this dataset represent the daily sales and logistics of a variety of yogurt-based stock. Each file includes the logistics for one product on a daily basis for three years, from 2020 to 2022.

    3.1 Data Collection

    The process of building this dataset involves several steps to ensure that the data is accurate, comprehensive and relevant.

    The first step is to determine the specific data that is needed to support the business objectives of the industry, i.e., in this publication’s case the daily sales data.

    Once the data requirements have been identified, the next step is to implement an effective sales data collection method. In MEVGAL’s case this is conducted through direct communication and reports generated each day by representatives & selling points.

    It is also important for MEVGAL to ensure that the data collection process is conducted in an ethical and compliant manner, adhering to data privacy laws and regulations. The industry also has a data management plan in place to ensure that the data is securely stored and protected from unauthorised access.

    The published dataset consists of 13 features providing information about the date and the number of products that have been sold. Finally, the dataset was anonymised in consideration of the privacy requirements of the data owner (MEVGAL).

    | File | Period | Number of Samples (days) |
    | :--- | :--- | ---: |
    | product 1 2020.xlsx | 01/01/2020–31/12/2020 | 363 |
    | product 1 2021.xlsx | 01/01/2021–31/12/2021 | 364 |
    | product 1 2022.xlsx | 01/01/2022–31/12/2022 | 365 |
    | product 2 2020.xlsx | 01/01/2020–31/12/2020 | 363 |
    | product 2 2021.xlsx | 01/01/2021–31/12/2021 | 364 |
    | product 2 2022.xlsx | 01/01/2022–31/12/2022 | 365 |
    | product 3 2020.xlsx | 01/01/2020–31/12/2020 | 363 |
    | product 3 2021.xlsx | 01/01/2021–31/12/2021 | 364 |
    | product 3 2022.xlsx | 01/01/2022–31/12/2022 | 365 |
    | product 4 2020.xlsx | 01/01/2020–31/12/2020 | 363 |
    | product 4 2021.xlsx | 01/01/2021–31/12/2021 | 364 |
    | product 4 2022.xlsx | 01/01/2022–31/12/2022 | 364 |
    | product 5 2020.xlsx | 01/01/2020–31/12/2020 | 363 |
    | product 5 2021.xlsx | 01/01/2021–31/12/2021 | 364 |
    | product 5 2022.xlsx | 01/01/2022–31/12/2022 | 365 |
    | product 6 2020.xlsx | 01/01/2020–31/12/2020 | 362 |
    | product 6 2021.xlsx | 01/01/2021–31/12/2021 | 364 |
    | product 6 2022.xlsx | 01/01/2022–31/12/2022 | 365 |
    | product 7 2020.xlsx | 01/01/2020–31/12/2020 | 362 |
    | product 7 2021.xlsx | 01/01/2021–31/12/2021 | 364 |
    | product 7 2022.xlsx | 01/01/2022–31/12/2022 | 365 |

    3.2 Dataset Overview

    The following table enumerates and explains the features included across all of the included files.

    | Feature | Description | Unit |
    | :--- | :--- | :--- |
    | Day | Day of the month | - |
    | Month | Month | - |
    | Year | Year | - |
    | daily_unit_sales | Daily sales: the amount of product, measured in units, sold on that specific day | units |
    | previous_year_daily_unit_sales | Previous year's sales: the amount of product, measured in units, sold on that specific day the previous year | units |
    | percentage_difference_daily_unit_sales | The percentage difference between the two values above | % |
    | daily_unit_sales_kg | The amount of product, measured in kilograms, sold on that specific day | kg |
    | previous_year_daily_unit_sales_kg | Previous year's sales: the amount of product, measured in kilograms, sold on that specific day the previous year | kg |
    | percentage_difference_daily_unit_sales_kg | The percentage difference between the two values above | % |
    | daily_unit_returns_kg | The percentage of the products that were shipped to selling points and were returned | % |
    | previous_year_daily_unit_returns_kg | The percentage of the products that were shipped to selling points and were returned the previous year | % |
    | points_of_distribution | The number of sales representatives through which the product was sold to the market this year | - |
    | previous_year_points_of_distribution | The number of sales representatives through which the product was sold to the market on the same day the previous year | - |

    Table 1 – Dataset Feature Description

    4. Structure and Format

    4.1 Dataset Structure

    The provided dataset has the following structure:

    | Name | Type | Property |
    | :--- | :--- | :--- |
    | Readme.docx | Report | A file that contains the documentation of the dataset. |
    | product X | Folder | A folder containing the data of product X. |
    | product X YYYY.xlsx | Data file | An Excel file containing the sales data of product X for year YYYY. |

    Table 2 - Dataset File Description
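    Given the folder layout in Table 2, the yearly workbooks can be concatenated into a single frame. A minimal sketch, assuming pandas with openpyxl is available and that each workbook's first sheet holds the daily records (the column names come from Table 1):

    ```python
    from pathlib import Path
    import pandas as pd

    frames = []
    # Folder and file names follow Table 2: "product X/product X YYYY.xlsx".
    for path in sorted(Path(".").glob("product */product * *.xlsx")):
        df = pd.read_excel(path)          # first sheet assumed to hold daily records
        df["source_file"] = path.name
        frames.append(df)

    sales = pd.concat(frames, ignore_index=True)
    print(sales[["Day", "Month", "Year", "daily_unit_sales"]].head())
    ```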

    5. Acknowledgement

    This project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No. 957406 (TERMINET).

    References

    [1] MEVGAL is a Greek dairy production company

  7. Easing into Excellent Excel Practices Learning Series / Série...

    • search.dataone.org
    • borealisdata.ca
    Updated Dec 28, 2023
    Cite
    Marcoux, Julie (2023). Easing into Excellent Excel Practices Learning Series / Série d'apprentissages en route vers des excellentes pratiques Excel [Dataset]. http://doi.org/10.5683/SP3/WZYO1F
    Dataset updated
    Dec 28, 2023
    Dataset provided by
    Borealis
    Authors
    Marcoux, Julie
    Description

    With a step-by-step approach, learn to prepare Excel files, data worksheets, and individual data columns for data analysis; practice conditional formatting and creating pivot tables/charts; go over basic principles of Research Data Management as they might apply to an Excel project. Avec une approche étape par étape, apprenez à préparer pour l’analyse des données des fichiers Excel, des feuilles de calcul de données et des colonnes de données individuelles; pratiquez la mise en forme conditionnelle et la création de tableaux croisés dynamiques ou de graphiques; passez en revue les principes de base de la gestion des données de recherche tels qu’ils pourraient s’appliquer à un projet Excel.

  8. An Extensive Dataset for the Heart Disease Classification System

    • data.mendeley.com
    Updated Feb 15, 2022
    + more versions
    Cite
    Sozan S. Maghdid (2022). An Extensive Dataset for the Heart Disease Classification System [Dataset]. http://doi.org/10.17632/65gxgy2nmg.1
    Dataset updated
    Feb 15, 2022
    Authors
    Sozan S. Maghdid
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Finding a good data source is the first step toward creating a database. Cardiovascular diseases (CVDs) are the major cause of death worldwide. CVDs include coronary heart disease, cerebrovascular disease, rheumatic heart disease, and other heart and blood vessel problems. According to the World Health Organization, 17.9 million people die from CVDs each year. Heart attacks and strokes account for more than four out of every five CVD deaths, with one-third of these deaths occurring before the age of 70.

    A comprehensive database of factors that contribute to a heart attack has been constructed. The main purpose is to collect characteristics of heart attacks, or the factors that contribute to them. A form was created in Microsoft Excel to accomplish this. Figure 1 depicts the form, which has nine fields: eight input fields and one output field. Age, gender, heart rate, systolic BP, diastolic BP, blood sugar, CK-MB, and troponin test represent the input fields, while the output field pertains to the presence of a heart attack and is divided into two categories (negative and positive): negative refers to the absence of a heart attack, while positive refers to its presence. Table 1 shows detailed information and the maximum and minimum attribute values for the 1,319 cases in the whole database. To confirm the validity of this data, we looked at the patient files in the hospital archive and compared them with the data stored in the laboratory system; we also interviewed the patients and specialized doctors. Table 2 is a sample from the database showing 44 cases and the factors that lead to a heart attack.

    After collecting this data, we checked it for null values (invalid values) and for errors made during data collection. A value is null if it is unknown, and null values necessitate special treatment: they indicate that the target isn't a valid data element, and arithmetic operations on a numeric column containing one or more null values yield null. An example of null value processing is shown in Figure 2.

    The data used in this investigation were scaled between 0 and 1 to guarantee that all inputs and outputs received equal attention and to eliminate their dimensionality. Prior to the use of AI models, data normalization has two major advantages: first, it prevents attributes in larger numeric ranges from overshadowing those in smaller numeric ranges; second, it avoids numerical problems during processing. After completing the normalization process, we split the data set into two parts, using 1,060 cases for training and 259 for testing, and implemented the modeling using the input and output variables.
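    A hedged sketch of the preprocessing the description outlines (min-max scaling to [0, 1] followed by a 1,060/259 train/test split); the file name and column names below are assumptions based on the nine fields listed above, not the file's actual headers:

    ```python
    import pandas as pd

    # Assumed file and column names based on the nine fields described above.
    cols = ["age", "gender", "heart_rate", "systolic_bp", "diastolic_bp",
            "blood_sugar", "ck_mb", "troponin", "result"]
    df = pd.read_csv("heart_attack_dataset.csv", names=cols, header=0)

    features = df.drop(columns="result")
    # Min-max scaling to [0, 1], as described in the dataset methodology.
    scaled = (features - features.min()) / (features.max() - features.min())

    # 1,060 training cases and 259 test cases, as reported above (1,319 total).
    train_X, test_X = scaled.iloc[:1060], scaled.iloc[1060:1319]
    train_y, test_y = df["result"].iloc[:1060], df["result"].iloc[1060:1319]
    print(train_X.shape, test_X.shape)
    ```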

  9. Immigration statistics data tables, year ending December 2020

    • gov.uk
    Updated Feb 25, 2021
    + more versions
    Cite
    Home Office (2021). Immigration statistics data tables, year ending December 2020 [Dataset]. https://www.gov.uk/government/statistical-data-sets/immigration-statistics-data-tables-year-ending-december-2020
    Dataset updated
    Feb 25, 2021
    Dataset provided by
    GOV.UKhttp://gov.uk/
    Authors
    Home Office
    Description

    The Home Office has changed the format of the published data tables for a number of areas (asylum and resettlement, entry clearance visas, extensions, citizenship, returns, detention, and sponsorship). These now include summary tables, and more detailed datasets (available on a separate page, link below). A list of all available datasets on a given topic can be found in the ‘Contents’ sheet in the ‘summary’ tables. Information on where to find historic data in the ‘old’ format is in the ‘Notes’ page of the ‘summary’ tables.

    The Home Office intends to make these changes in other areas in the coming publications. If you have any feedback, please email MigrationStatsEnquiries@homeoffice.gov.uk.

    Related content

    Immigration statistics, year ending September 2020
    Immigration Statistics Quarterly Release
    Immigration Statistics User Guide
    Publishing detailed data tables in migration statistics
    Policy and legislative changes affecting migration to the UK: timeline
    Immigration statistics data archives

    Asylum and resettlement

    Asylum and resettlement summary tables, year ending December 2020 (MS Excel Spreadsheet, 359 KB): https://assets.publishing.service.gov.uk/media/602bab69e90e070562513e35/asylum-summary-dec-2020-tables.xlsx

    Detailed asylum and resettlement datasets

    Sponsorship

    Sponsorship summary tables, year ending December 2020 (MS Excel Spreadsheet, 67.7 KB): https://assets.publishing.service.gov.uk/media/602bab8fe90e070552b33515/sponsorship-summary-dec-2020-tables.xlsx

    Detailed sponsorship datasets

    Entry clearance visas granted outside the UK

    Entry clearance visas summary tables, year ending December 2020 (MS Excel Spreadsheet, 70.3 KB): https://assets.publishing.service.gov.uk/media/602bf8708fa8f50384219401/visas-summary-dec-2020-tables.xlsx

    Detailed entry clearance visas datasets

    Passenger arrivals (admissions)

    Passenger arrivals (admissions) summary tables, year ending December 2020 (MS Excel Spreadsheet, 70.6 KB): https://assets.publishing.service.gov.uk/media/602bac148fa8f5037f5d849c/passenger-arrivals-admissions-summary-dec-2020-tables.xlsx

    Detailed Passengers initially refused entry at port datasets

    Extensions

    Extensions summary tables, year ending December 2020 (MS Excel Spreadsheet, 41.5 KB): https://assets.publishing.service.gov.uk/media/602bac3d8fa8f50383c41f7c/extentions-summary-dec-2020-tables.xlsx


  10. Annual Retail Store Data, 2000 [Canada] [Excel]

    • dataverse.scholarsportal.info
    • borealisdata.ca
    • +1more
    pdf, xls
    Updated Nov 17, 2021
    + more versions
    Cite
    Scholars Portal Dataverse (2021). Annual Retail Store Data, 2000 [Canada] [Excel] [Dataset]. https://dataverse.scholarsportal.info/dataset.xhtml;jsessionid=1283d69ee2dd528c9011fe4a2fe3?persistentId=hdl%3A10864%2F11351&version=&q=&fileTypeGroupFacet=&fileAccess=&fileTag=%22Tables%22&fileSortField=&fileSortOrder=
    Explore at:
    xls(2165760), xls(29696), xls(2920448), pdf(76787), pdf(158404), xls(34816), xls(2754048), pdf(81084), pdf(71183), xls(34304), xls(625664), xls(2707968), xls(695808), pdf(70673), pdf(72585), xls(576512), xls(609792), xls(28672), pdf(60236), pdf(30338), pdf(87181), pdf(84140), pdf(92012), xls(610304), pdf(74439), xls(2471424), pdf(73788), xls(30208), pdf(74478), pdf(53645)
    Dataset updated
    Nov 17, 2021
    Dataset provided by
    Scholars Portal Dataverse
    Area covered
    Canada
    Description

    The annual Retail store data CD-ROM is an easy-to-use tool for quickly discovering retail trade patterns and trends. The current product presents results from the 1999 and 2000 Annual Retail Store and Annual Retail Chain surveys. This product contains numerous cross-classified data tables using the North American Industry Classification System (NAICS). The data tables provide access to a wide range of financial variables, such as revenues, expenses, inventory, sales per square footage (chain stores only) and the number of stores. Most data tables contain detailed information on industry (as low as 5-digit NAICS codes), geography (Canada, provinces and territories) and store type (chains, independents, franchises). The electronic product also contains survey metadata, questionnaires, information on industry codes and definitions, and the list of retail chain store respondents.

  11. Data for: Sustainable connectivity in a community repository

    • explore.openaire.eu
    • data.niaid.nih.gov
    • +3more
    Updated Dec 7, 2023
    Cite
    Ted Habermann (2023). Data for: Sustainable connectivity in a community repository [Dataset]. http://doi.org/10.5061/dryad.nzs7h44xr
    Dataset updated
    Dec 7, 2023
    Authors
    Ted Habermann
    Description

    Data For: Sustainable Connectivity in a Community Repository

    ## GENERAL INFORMATION

    This readme.txt file was generated on 20231110 by Ted Habermann.

    ### Title of Dataset
    Data For: Sustainable Connectivity in a Community Repository

    ### Author Information
    Principal Investigator Contact Information
    Name: Ted Habermann (0000-0003-3585-6733)
    Institution: Metadata Game Changers
    ORCID: 0000-0003-3585-6733

    ### Date published or finalized for release:
    November 10, 2023

    ### Date of data collection (single date, range, approximate date)
    May and June 2023

    ### Information about funding sources that supported the collection of the data:
    National Science Foundation (Crossref Funder ID: 100000001) Award 2134956.

    ### Overview of the data (abstract):
    These data are Dryad metadata retrieved from Dryad and translated into csv files. There are two datasets:
    1. DryadJournalDataset was retrieved from Dryad using the ISSNs in the file DryadJournalDataset_ISSNs.txt, although some had no data.
    2. DryadOrganizationDataset was retrieved from Dryad using the RORs in the file DryadOrganizationDataset_RORs.txt, although some had no data.

    Each dataset includes four types of metadata: identifiers, funders, keywords, and related works, each in a separate comma (.csv) or tab (.tsv) delimited file. There are also Microsoft Excel files (.xlsx) for the identifier metadata and connectivity summaries for each dataset (*.html). The connectivity summaries include summaries of each parameter in all four data files with definitions, counts, unique counts, most frequent values, and completeness. These data formed the basis for an analysis of the connectivity of the Dryad repository for organizations, funders, and people.

    | Size | FileName |
    | --------: | :--- |
    | 90541505 | DryadJournalDataset_Identifiers_20230520_12.csv |
    | 9017051 | DryadJournalDataset_funders_20230520_12.tsv |
    | 29108477 | DryadJournalDataset_keywords_20230520_12.tsv |
    | 8833842 | DryadJournalDataset_relatedWorks_20230520_12.tsv |
    | 18260935 | DryadOrganizationDataset_funders_20230601_12.tsv |
    | 240128730 | DryadOrganizationDataset_identifiers_20230601_12.tsv |
    | 39600659 | DryadOrganizationDataset_keywords_20230601_12.tsv |
    | 11520475 | DryadOrganizationDataset_relatedWorks_20230601_12.tsv |
    | 40726143 | DryadJournalDataset_identifiers_20230520_12.xlsx |
    | 81894301 | DryadOrganizationDataset_identifiers_20230601_12.xlsx |
    | 842827 | DryadJournalDataset_ConnectivitySummary.html |
    | 387551 | DryadOrganizationDataset_ConnectivitySummary.html |

    ### Field Definitions

    ## SHARING/ACCESS INFORMATION

    ### Licenses/restrictions placed on the data:
    Creative Commons Public Domain License (CC0)

    ### Links to publications that cite or use the data:
    TBD

    ### Was data derived from another source?
    No

    ## DATA & FILE OVERVIEW

    ### File List
    A. *Dataset_identifiers_YYYYMMDD_HH.*sv: Identifier metadata from Dryad for Dataset collected at YYYYMMDD_HH using the Dryad API.
    B. *Dataset_funders_YYYYMMDD_HH.*sv: Funder metadata from Dryad for Dataset collected at YYYYMMDD_HH using the Dryad API.
    C. *Dataset_keywords_YYYYMMDD_HH.*sv: Keyword metadata from Dryad for Dataset collected at YYYYMMDD_HH using the Dryad API.
    D. *Dataset_relatedWorks_YYYYMMDD_HH.*sv: Related work metadata from Dryad for Dataset collected at YYYYMMDD_HH using the Dryad API.
    E. *Dataset_identifiers_YYYYMMDD_HH.xlsx: Excel spreadsheet with identifier metadata from Dryad for Dataset collected at YYYYMMDD_HH using the Dryad API.
    F. *Dataset_ConnectivitySummary.html: Connectivity summary for Dataset.
    G. summarizeConnectivity.ipynb: Python notebook with code for creating connectivity summaries and plots.

    ### Relationship between files:
    All files with the same dataset name make up a dataset. The *sv files are the original metadata extracted from Dryad.

    ## METHODOLOGICAL INFORMATION

    ### Description of methods used for collection/generation of data:
    Most of the analysis is simply extracting and comparing counts of various metadata elements.

    ## DATA-SPECIFIC INFORMATION

    See the connectivity summaries (*ConnectivitySummary.html) for a list of parameters in each file and summaries of their values.

    ### Identifier Metadata
    The identifier metadata datasets include the following fields:

    | Field | Definition |
    | :--- | :--- |
    | DOI | Digital object identifier for the dataset |
    | title | Title for the dataset |
    | datePublished | Date dataset published |
    | relatedPublicationISSN | International Standard Serial Number for journal with related publication |
    | primary_article | Digital object identifier for pr...
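    The delimited files can be profiled along the lines of the connectivity summaries (counts, unique counts, most frequent values, completeness). A minimal sketch with pandas, using a file name from the list above; the dataset's own summarizeConnectivity.ipynb presumably does this more thoroughly:

    ```python
    import pandas as pd

    # Tab-delimited funder metadata; file name taken from the file list above.
    funders = pd.read_csv("DryadJournalDataset_funders_20230520_12.tsv", sep="\t")

    # Per-field profile: non-null count, unique count, most frequent value, completeness.
    profile = pd.DataFrame({
        "count": funders.count(),
        "unique": funders.nunique(),
        "top": funders.mode().iloc[0],
        "completeness": funders.notna().mean().round(3),
    })
    print(profile)
    ```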

  12. Galilee geological model 25-05-15

    • data.gov.au
    • researchdata.edu.au
    zip
    Updated Apr 13, 2022
    + more versions
    Cite
    Bioregional Assessment Program (2022). Galilee geological model 25-05-15 [Dataset]. https://data.gov.au/data/dataset/bd1c35a0-52c4-421b-ac7d-651556670eb9
    Explore at:
    zip(122560650)
    Dataset updated
    Apr 13, 2022
    Dataset authored and provided by
    Bioregional Assessment Program
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Abstract

    This dataset was derived by the Bioregional Assessment Programme. The parent datasets are identified in the Lineage statement in this metadata statement. The processes undertaken to produce this derived dataset are described in the History field in this metadata statement.

    This dataset comprises interpreted elevation surfaces and contours for the major Triassic and Upper Permian units of the Galilee Geological Basin.

    Purpose

    This dataset was created to provide formation extents for aquifers in the Galilee geological basin.

    Dataset History

    A Quality Assurance (QA) and validation process was conducted on the original well and bore data to choose wells/bores that are within 25 kilometres of the BA Galilee Region extent.

    The QA/Validation process is as follows:

    1. Well data

      a. Obtained excel file "QPED_July_2013_galilee.xlsx" from GA

      b. Based on stratigraphic information in "BH_costrat" tab formation names were regularised and simplified based on current naming conventions.

      c. Simplified names added to QPED_July_2013_galileet.xlsx as "Steve_geo" and "Steve_group"

      d. Produced new file "GSQ_Geology.xlsx" contained decimal latitude and longitude, KB elevation, top of unit in metres from KB, top of unit in metres AHD, bottom of unit in metres from KB, bottom of unit in metres AHD, original geology, simplified geology, simplified Group geology.

       i.     KB obtained from "BH_wellhist"
      
       ii.    Where no KB information was available (i.e. KB=0), sample the 1S DEM at the well's location to obtain the height; KB = DEM + 10. The well was marked as having lower reliability.
      
       iii.    Calculated Top_m_AHD = KB - Top_m_KB
      
       iv.    Calculated Bottom_m_AHD = KB - Bottom_m_KB
      

      e. Brought GSQ_Geology.xlsx into ArcGIS

      f. Selected wells based on "Steve_geo" field for each model layer to produce a geodatabase for each layer.

       i.     GSQ_basement_wells
      
       ii.    GSQ_top_joe_joe_group
      
       iii.    GSQ_top_bandanna_merge
      
       iv.    GSQ_rewan_group
      
       v.     GSQ_clematis
      
       vi.    GSQ_moolyember
      

      g. Additional wells and reinterpreted tops added to appropriate geodatabase based on well completion reports

      h. Additional wells added to coverages to help model building process

       i.     Well_name listed as Fake
      
       ii.    Exception being GSQ_top_basement_fake which was created as a separate geodatabase
      
    2. Bore data

      a. Obtained QLD_DNRM_GroundwaterDatabaseExtract_20131111 from GA

      b. Used files REGISTRATIONS.txt, ELEVATIONS.txt and AQUIFER.txt to build GW_stratigraphy.xlsx

       i.     Based on RN
      
       ii.    Latitude from GIS_LAT (REGISTRATIONS.txt)
      
       iii.    Longitude from GIS_LNG (REGISTRATIONS.txt)
      
       iv.    Elevation from (ELEVATIONS.txt)
      
       v.     FORM_DESC from (AQUIFER.txt)
      
       vi.    Top from (AQUIFER.txt)
      
       vii.    Bottom from (AQUIFER.txt)
      

      c. Brought GW_stratigraphy.xlsx into ArcGIS

      d. Created gw_bores_galilee_dem

       i.     Sampled 1S DEM to obtain ground level elevation column RASTERVALU
      
       ii.    Created column top_m_AHD by RASTERVALU - Top
      

      e. Selected bores based on "FORM_DESC" field for each model layer to produce a geodatabase for each layer.

       i.     Gw_basement
      
       ii.    GW_bores_joe_joe_group
      
       iii.    GW_bores_bandanna
      
       iv.    Gw_bores_rewan
      
       v.     Gw_bores_clematis
      
       vi.    Gw_bores_moolyember
      
    3. Georectified seismic surfaces

      a. Extracted interpreted seismic surfaces for base Permian (interpreted as basement) and top Bandanna (in time) from the following seismic surveys

       i.     Y80A, W81A, Carmichael, Pendine, T81A, Quilpie, Ward and Powell Creek seismic survey downloaded https://qdexguest.deedi.qld.gov.au/portal/site/qdex/search?searchType=general 
      
       ii.    Brought TIF images into ArcGIS and georectified
      
       iii.    Digitised shape of contours and faults into geodatabase
      
           1.   Basement_contours and basement_faults
      
           2.   bandanna_contours_new_data and bandanna_faults
      
       iv.    Added field "contour" to geodatabase
      
       v.     Converted contours to depth in "contour" field based on well and bore data (top_m_AHD) and contour progression
      
       vi.    Use the shape and depth derived from OZ SEEBASE to help to add additional contours and faults to basement and bandanna datasets
      
    4. Additional contour and fault surfaces were built derived from underlying surfaces and wells/bore data

      a. Joejoe_contours and joejoe_faults

      b. Rewan_contour_clip (used bandanna_faults as fault coverage)

      c. Clematis_contour and clematis_faults

      d. Moolyember_contour (used clematis_faults as fault coverage)

    5. Surface geology

      a. Extracted surface geology from QUEENSLAND GEOLOGY_AUGUST_2012 using Galilee BA region boundary with 25 kilometre boundary to form geodatabase QLD_geology_galilee

      b. Selected relevant surface geology from QLD_geology_galilee based on field "Name" for each model layer and created new geodatabase layers

       i.     Basement_geology: Argentine Metamorphics,Running River Metamorphics,Charters Towers Metamorphics; Bimurra Volcanics, Foyle Volcanics, Mount Wyatt Formation, Saint Anns Formation, Silver Hills Volcanics, Stones Creek Volcanics; Bulliwallah Formation, Ducabrook Formation, Mount Rankin Formation, Natal Formation, Star of Hope Formation; Cape River Metamorphics; Einasleigh Metamorphics; Gem Park Granite; Macrossan Province Cambrian-Ordovician intrusives; Macrossan Province Ordovician-Silurian intrusives; Macrossan Province Ordovician intrusives; Mount Formartine, unnamed plutonic units; Pama Province Silurian-Devonian intrusives; Seventy Mile Range Group; and Kirk River beds, Les Jumelles beds.
      
       ii.    Joe_joe_geology: Joe Joe Group
      
       iii.    Galilee_permian_geology: Back Creek Group, Betts Creek Group, Blackwater Group
      
       iv.    Rewan_geology: Rewan Group
      
          1.    Later also made dunda_beds_geology to be included in Rewan model: Dunda beds
      
       v.     Clematis_geology: Clematis Group
      
          1.    Later also made warang_sandstone_geology to be included in Clematis model: Warang Sandstone
      
       vi.    Moolyember_surface_geology: Moolyember Formation
      
    6. DEM for each model layer

      a. Using surface geology geodatabase extent extract grid from dem_s_1s to represent the top of the model layer at the surface

       i.     Basement_dem
      
       ii.    Joejoe_dem
      
       iii.    Bandanna_dem
      
       iv.    Rewan_dem and dunda_dem
      
       v.     Clematis_dem and warang_dem
      
       vi.    Moolyember_surface_dem
      

      b. Used Contour tool in ArcGIS to obtain a 25 metre contour geodatabase from the relevant model DEM

       i.     Basement_dem_contours
      
       ii.    Joejoe_dem_contours
      
       iii.    Bandanna_dem_contours
      
       iv.    Rewan_dem_contours and dunda_dem_contours
      
       v.     Clematis_dem_contours and warang_dem_contours
      
       vi.    Moolyember_dem_contours
      

      c. For the purpose of guiding the model building process, additional fields were added to each DEM contour geodatabase based on average thickness derived from groundwater bores and petroleum wells.

       i.     Basement_dem_contours: Joejoe, bandanna, rewan, clematis, moolyember
      
       ii.    Joejoe_dem_contours: basement, bandanna
      
       iii.    Bandanna_dem_contours: joejoe, rewan
      
       iv.    Rewan_dem_contours and dunda_dem_contours: clematis, rewan
      
       v.     Clematis_dem_contours and warang_dem_contours: moolyember, rewan
      
       vi.  Moolyember_dem_contours: clematis
      

    The model building process is as follows:

    1. Used the Topo to Raster tool to create surfaces based on the following rules

      a. Environment

          i.  Extent
      
             1. Top: -19.7012030024424
      
             2. Right: 148.891511819054
      
             3. Bottom: -27.5812030024424
      
             4. Left: 139.141511819054
      
          ii. Output cell size: 0.01 degrees
      
          iii. Drainage enforcement: No_enforce
      

      b. Input

          i.  Basement
      
             1. Basement_dem_contour; field - contour; type - contour
      
             2. Joejoe_dem_contour; field - basement; type - contour
      
             3. Basement_contour; field - contour; type - contour
      
             4. GSQ_basement_wells; field - top_m_AHD; type - point elevation
      
             5. GW_basement; field - top_m_AHD; type - point elevation
      
             6. GSQ_top_basement_fake; field - top_m_AHD; type - point elevation
      
             7. Basement_faults; type - cliff
      
         ii.  Joe Joe Group
      
             1. Joejoe_dem_contour; field - basement; type - contour
      
             2. Basement_dem_contour; field - joejoe; type - contour
      
             3. permian_dem_contour; field - joejoe, type - contour
      
             4. joejoe_contour; field - joejoe; type - contour
      
             5. GSQ_top_joejoe_group; field - top_m_AHD; type - point elevation
      
             6. GW_bores_joe_joe_group; field - top_m_AHD; type - point elevation
      
             7. joejoe_faults; type - cliff
      
         iii.  Bandanna Group
      
             1. Permian_dem_contour; field - contour; type - contour
      
             2. Joejoe_dem_contour; field - bandanna; type - contour
      
             3. Rewan_dem_contour: field - bandanna; type - contour
      
             4. Dunda_dem_contour; field - bandanna; type - contour
      
  13. FOI: early years dataset as at 31 March 2016

    • gov.uk
    • s3.amazonaws.com
    Updated Jul 21, 2021
    + more versions
    Cite
    Ofsted (2021). FOI: early years dataset as at 31 March 2016 [Dataset]. https://www.gov.uk/government/statistical-data-sets/foi-early-years-dataset-as-at-31-march-2016
    Dataset updated
    Jul 21, 2021
    Dataset provided by
    GOV.UKhttp://gov.uk/
    Authors
    Ofsted
    Description

    There is a requirement that public authorities, like Ofsted, must publish updated versions of datasets which are disclosed as a result of Freedom of Information requests.

    Some information which is requested is exempt from disclosure to the public under the Freedom of Information Act; it is therefore not appropriate for this information to be made available. Examples of information which it is not appropriate to make available include the locations of women's refuges, some military bases and all children's homes, and the personal data of providers and staff. Ofsted also considers that the names and addresses of registered childminders are their personal data, which it is not appropriate to make publicly available unless those individuals have given their explicit consent to do so. This information has therefore not been included in the datasets.

    Data for both childcare and childminders are included in the excel file.

    FOI: early years dataset as at 31 March 2016 (MS Excel Spreadsheet, 16.6 MB): https://assets.publishing.service.gov.uk/media/60f7f6a4d3bf7f568160edb1/FOI_early_years_dataset_as_at_31_March_2016.xlsx
    
    
    
    
    

  14. V-Dem v 12 dataset, all variables, trimmed to 12 countries

    • borealisdata.ca
    • search.dataone.org
    Updated Mar 27, 2024
    + more versions
    Cite
    Wally Seccombe (2024). V-Dem v 12 dataset, all variables, trimmed to 12 countries [Dataset]. http://doi.org/10.5683/SP3/988EOU
    Explore at: Croissant
    Dataset updated
    Mar 27, 2024
    Dataset provided by
    Borealis
    Authors
    Wally Seccombe
    License

    CC0 1.0 Universal Public Domain Dedicationhttps://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    From the massive set of V-Dem v 12 variables, we have inserted 27 into the main CPEDB dataset. Here is the entire set, organized in an Excel file to match the country/year rows of the main SPSS file. This precise correspondence makes it easy to insert other variables from the V-Dem dataset into the main file, where they can be statistically combined with a wide variety of variables from other sources.
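    Because the Excel file is row-aligned with the country/year rows of the main SPSS file, pulling additional V-Dem variables into the main dataset amounts to a keyed merge. A hedged sketch with pandas; the file names, key column names, and the example variable are assumptions, and reading the .sav file requires pyreadstat:

    ```python
    import pandas as pd

    # Hypothetical file and column names; adjust to the actual key columns in both files.
    vdem = pd.read_excel("vdem_v12_12_countries.xlsx")
    main = pd.read_spss("cpedb_main.sav")  # requires the pyreadstat package

    # Merge a selected V-Dem variable into the main dataset on country/year keys.
    merged = main.merge(vdem[["country_name", "year", "v2x_polyarchy"]],
                        on=["country_name", "year"], how="left")
    print(merged.shape)
    ```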

  15. Global Burden of Disease analysis dataset of noncommunicable disease...

    • data.mendeley.com
    Updated Apr 6, 2023
    + more versions
    Cite
    David Cundiff (2023). Global Burden of Disease analysis dataset of noncommunicable disease outcomes, risk factors, and SAS codes [Dataset]. http://doi.org/10.17632/g6b39zxck4.10
    Dataset updated
    Apr 6, 2023
    Authors
    David Cundiff
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This formatted dataset (AnalysisDatabaseGBD) originates from raw data files from the Institute of Health Metrics and Evaluation (IHME) Global Burden of Disease Study (GBD2017) affiliated with the University of Washington. We are volunteer collaborators with IHME and not employed by IHME or the University of Washington.

    The population weighted GBD2017 data are on male and female cohorts ages 15-69 years including noncommunicable diseases (NCDs), body mass index (BMI), cardiovascular disease (CVD), and other health outcomes and associated dietary, metabolic, and other risk factors. The purpose of creating this population-weighted, formatted database is to explore the univariate and multiple regression correlations of health outcomes with risk factors. Our research hypothesis is that we can successfully model NCDs, BMI, CVD, and other health outcomes with their attributable risks.

    These Global Burden of Disease data relate to the preprint: The EAT-Lancet Commission Planetary Health Diet compared with Institute of Health Metrics and Evaluation Global Burden of Disease Ecological Data Analysis. The data include the following:

    1. Analysis database of population-weighted GBD2017 data that includes over 40 health risk factors, noncommunicable disease deaths/100k/year of male and female cohorts ages 15-69 years from 195 countries (the primary outcome variable, covering over 100 types of noncommunicable diseases) and over 20 individual noncommunicable diseases (e.g., ischemic heart disease, colon cancer, etc.)
    2. A text file to import the analysis database into SAS
    3. The SAS code to format the analysis database to be used for analytics
    4. SAS code for deriving Tables 1, 2, 3 and Supplementary Tables 5 and 6
    5. SAS code for deriving the multiple regression formula in Table 4
    6. SAS code for deriving the multiple regression formula in Table 5
    7. SAS code for deriving the multiple regression formula in Supplementary Table 7
    8. SAS code for deriving the multiple regression formula in Supplementary Table 8
    9. The Excel files that accompanied the above SAS code to produce the tables

    For questions, please email davidkcundiff@gmail.com. Thanks.

  16. Sapota Fruit Datasets

    • data.mendeley.com
    Updated Jul 11, 2025
    Cite
    Anita Bhatt (2025). Sapota Fruit Datasets [Dataset]. http://doi.org/10.17632/jgtb95x6kf.2
    Dataset updated
    Jul 11, 2025
    Authors
    Anita Bhatt
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    These datasets support research in quality control, classification, and visual recognition of Sapota fruit. Dataset-1 and Dataset-2 consist of images of Sapota fruit, categorized based on their visual and physical characteristics: the images are classified into spoiled fruit and fresh fruit, with the fresh fruit further sorted by size (small, medium, and large). Dataset-1 contains images sized 224x224 pixels with a fixed white background. Dataset-2 includes images at their original size of 4000x3000 pixels, with the background removed and annotations saved in a text file. Physical parameters for the original images are saved in Excel files, from which Dataset-1 and Dataset-2 were eventually produced. Dataset-3 includes images captured against various backgrounds, showcasing different sizes of Sapota fruit.

    Overall, these datasets are designed to support research on Sapota fruit quality control, classification, and visual recognition.

  17. Graph Input Data Example.xlsx

    • figshare.com
    xlsx
    Updated Dec 26, 2018
    Cite
    Dr Corynen (2018). Graph Input Data Example.xlsx [Dataset]. http://doi.org/10.6084/m9.figshare.7506209.v1
    Explore at:
    xlsx
    Dataset updated
    Dec 26, 2018
    Dataset provided by
    Figsharehttp://figshare.com/
    Authors
    Dr Corynen
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The various performance criteria applied in this analysis include the probability of reaching the ultimate target, the costs, elapsed times and system vulnerability resulting from any intrusion. This Excel file contains all the logical, probabilistic and statistical data entered by a user, and required for the evaluation of the criteria. It also reports the results of all the computations.

  18. Waterworks — intake point reporting

    • gimi9.com
    • data.europa.eu
    Updated Feb 2, 2022
    + more versions
    Cite
    (2022). Waterworks — intake point reporting [Dataset]. https://gimi9.com/dataset/eu_https-data-norge-no-node-1499/
    Explore at:
    Dataset updated
    Feb 2, 2022
    Description

    The data sets provide an overview of selected data on waterworks registered with the Norwegian Food Safety Authority. The information has been reported by the waterworks through application processing or other reporting to the Norwegian Food Safety Authority. Drinking water regulations require, among other things, annual reporting, and the Norwegian Food Safety Authority has created a separate form service for this reporting. The data sets include public or private waterworks that supply 50 people or more. In addition, all municipally owned businesses with their own water supply are included regardless of size. The data sets also contain decommissioned facilities, so that historical data (data for previous years) remain available. There are data sets for the following supervisory objects:
    1. Water supply system (also includes analysis of drinking water)
    2. Transport system
    3. Treatment facility
    4. Intake point (also includes analysis of the water source)
    This entry provides the data set for item 4, intake point_reporting. In addition, a file (information.txt) gives an overview of when the extracts were produced and how many lines there are in the individual files; the extracts are produced weekly. Furthermore, for the water supply system, transport system, and intake point data sets it is possible to see historical data on what is included in the annual reporting. These files have the _reporting ending in the file name; to make use of that information, each _reporting file must be linked to its "mother" file to get names and other static information. Descriptions of the data fields (i.e., metadata) in the individual data sets appear in separate files, available in PDF format. If you double-click the CSV file and it opens directly in Excel, the Norwegian characters æøå will not display correctly. To see the character set correctly in Excel:
    1. Start Excel and open a new spreadsheet.
    2. Select Data, then From Text, and press Import.
    3. Choose delimited data, set File origin to 65001: Unicode (UTF-8), tick "My data has headers", and press Next.
    4. Remove Tab as separator, select Semicolon as separator, press Next, and finish the import.
    Alternatively, the complete data sets can be imported into a separate database and compiled as desired; link keys in the files make it possible to join the files together. The waterworks are responsible for the quality of the data sets. Purpose: make data on drinking water supply available to the public.
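
    As an illustrative alternative to the Excel import above (not part of the official documentation), the extracts can also be read with Python and pandas by declaring the semicolon separator and UTF-8 encoding explicitly. The file names and the "link_key" column below are placeholders; the actual link-key column names are documented in the accompanying metadata PDFs.

    ```python
    # Minimal sketch: read the semicolon-separated UTF-8 extracts and join a
    # *_reporting file to its "mother" file on an assumed link-key column.
    import pandas as pd

    read_opts = dict(sep=";", encoding="utf-8")  # semicolon separator, UTF-8 so æøå render correctly

    intake_points = pd.read_csv("intake_point.csv", **read_opts)             # static ("mother") data
    intake_reports = pd.read_csv("intake_point_reporting.csv", **read_opts)  # annual reporting rows

    # Join the reporting rows to names and other static information.
    merged = intake_reports.merge(intake_points, on="link_key", how="left")
    print(merged.head())
    ```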

  19. Delta Produce Sources Study | gimi9.com

    • gimi9.com
    Updated Feb 12, 2021
    Cite
    (2021). Delta Produce Sources Study | gimi9.com [Dataset]. https://gimi9.com/dataset/data-gov_delta-produce-sources-study-51a7a
    Explore at:
    Dataset updated
    Feb 12, 2021
    License

    CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    Resource Description: The dataset contains variables corresponding to availability, source (country, state and town if country is the United States), quality, and price (by weight or volume) of 13 fresh fruits and 32 fresh vegetables sold in farmers markets and grocery stores located in 5 Lower Mississippi Delta towns.
    Resource Software Recommended: Microsoft Excel, url: https://www.microsoft.com/en-us/microsoft-365/excel
    Resource Title: Delta Produce Sources Study data dictionary. File Name: DPS Data Dictionary Public.csv
    Resource Description: This file is the data dictionary corresponding to the Delta Produce Sources Study dataset.
    Resource Software Recommended: Microsoft Excel, url: https://www.microsoft.com/en-us/microsoft-365/excel

  20. Cleaned NHANES 1988-2018

    • figshare.com
    txt
    Updated Feb 18, 2025
    Cite
    Vy Nguyen; Lauren Y. M. Middleton; Neil Zhao; Lei Huang; Eliseu Verly; Jacob Kvasnicka; Luke Sagers; Chirag Patel; Justin Colacino; Olivier Jolliet (2025). Cleaned NHANES 1988-2018 [Dataset]. http://doi.org/10.6084/m9.figshare.21743372.v9
    Explore at:
    txt
    Available download formats
    Dataset updated
    Feb 18, 2025
    Dataset provided by
    figshare
    Authors
    Vy Nguyen; Lauren Y. M. Middleton; Neil Zhao; Lei Huang; Eliseu Verly; Jacob Kvasnicka; Luke Sagers; Chirag Patel; Justin Colacino; Olivier Jolliet
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The National Health and Nutrition Examination Survey (NHANES) provides data with considerable potential for studying the health and environmental exposures of the non-institutionalized US population. However, because NHANES data are plagued with multiple inconsistencies, these data must be processed before new insights can be derived through large-scale analyses. We therefore developed a set of curated and unified datasets by merging 614 separate files and harmonizing unrestricted data across NHANES III (1988-1994) and Continuous NHANES (1999-2018), totaling 135,310 participants and 5,078 variables. The variables convey demographics (281 variables), dietary consumption (324 variables), physiological functions (1,040 variables), occupation (61 variables), questionnaires (1,444 variables, e.g., physical activity, medical conditions, diabetes, reproductive health, blood pressure and cholesterol, early childhood), medications (29 variables), mortality information linked from the National Death Index (15 variables), survey weights (857 variables), environmental exposure biomarker measurements (598 variables), and chemical comments indicating which measurements are below or above the lower limit of detection (505 variables).

    CSV Data Record: The curated NHANES datasets and the data dictionaries comprise 23 .csv files and 1 Excel file. The curated NHANES datasets involve 20 .csv formatted files, two for each module, with one as the uncleaned version and the other as the cleaned version. The modules are labeled as follows: 1) mortality, 2) dietary, 3) demographics, 4) response, 5) medications, 6) questionnaire, 7) chemicals, 8) occupation, 9) weights, and 10) comments. "dictionary_nhanes.csv" is a dictionary that lists the variable name, description, module, category, units, CAS number, comment use, chemical family, chemical family shortened, number of measurements, and cycles available for all 5,078 variables in NHANES. "dictionary_harmonized_categories.csv" contains the harmonized categories for the categorical variables. "dictionary_drug_codes.csv" contains the dictionary of descriptors for the drug codes. "nhanes_inconsistencies_documentation.xlsx" is an Excel file that contains the cleaning documentation, which records all the inconsistencies for all affected variables to help curate each of the NHANES modules.

    R Data Record: For researchers who want to conduct their analysis in the R programming language, only the cleaned NHANES modules and the data dictionaries can be downloaded, as a .zip file that includes an .RData file and an .R file. "w - nhanes_1988_2018.RData" contains all the aforementioned datasets as R data objects. We make available all R scripts on customized functions that were written to curate the data. "m - nhanes_1988_2018.R" shows how we used the customized functions (i.e., our pipeline) to curate the original NHANES data.

    Example starter code: The set of starter code to help users conduct exposome analyses consists of four R Markdown files (.Rmd). We recommend going through the tutorials in order. "example_0 - merge_datasets_together.Rmd" demonstrates how to merge the curated NHANES datasets together. "example_1 - account_for_nhanes_design.Rmd" demonstrates how to conduct a linear regression model, a survey-weighted regression model, a Cox proportional hazard model, and a survey-weighted Cox proportional hazard model. "example_2 - calculate_summary_statistics.Rmd" demonstrates how to calculate summary statistics for one variable and for multiple variables, with and without accounting for the NHANES sampling design. "example_3 - run_multiple_regressions.Rmd" demonstrates how to run multiple regression models with and without adjusting for the sampling design.
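
    As a minimal sketch only, and in Python rather than the R tutorials supplied with the dataset, the following shows the kind of merge that "example_0 - merge_datasets_together.Rmd" performs. The module file names and the participant identifier column "SEQN" are assumptions; check dictionary_nhanes.csv and the downloaded file names before use.

    ```python
    # Minimal sketch: join two cleaned NHANES modules on an assumed participant ID.
    import pandas as pd

    demographics = pd.read_csv("demographics_clean.csv")  # hypothetical file name
    mortality = pd.read_csv("mortality_clean.csv")        # hypothetical file name

    # Merge on the participant identifier so each row describes one participant.
    merged = demographics.merge(mortality, on="SEQN", how="left")
    print(merged.shape)

    # The data dictionary maps variable names to descriptions, modules, and units.
    dictionary = pd.read_csv("dictionary_nhanes.csv")
    print(dictionary.columns.tolist())
    ```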

