61 datasets found
  1. Data from: Current and projected research data storage needs of Agricultural...

    • catalog.data.gov
    • agdatacommons.nal.usda.gov
    • +2more
    Updated Apr 21, 2025
    + more versions
    Cite
    Agricultural Research Service (2025). Current and projected research data storage needs of Agricultural Research Service researchers in 2016 [Dataset]. https://catalog.data.gov/dataset/current-and-projected-research-data-storage-needs-of-agricultural-research-service-researc-f33da
    Dataset updated
    Apr 21, 2025
    Dataset provided by
    Agricultural Research Servicehttps://www.ars.usda.gov/
    Description

The USDA Agricultural Research Service (ARS) recently established SCINet, which consists of a shared high-performance computing resource, Ceres, and the dedicated high-speed Internet2 network used to access Ceres. Current and potential SCINet users are using and generating very large datasets, so SCINet needs to be provisioned with adequate data storage for their active computing. It is not designed to hold data beyond active research phases. At the same time, the National Agricultural Library has been developing the Ag Data Commons, a research data catalog and repository designed for public data release and professional data curation. Ag Data Commons needs to anticipate the size and nature of data it will be tasked with handling.

    The ARS Web-enabled Databases Working Group, organized under the SCINet initiative, conducted a study to establish baseline data storage needs and practices, and to make projections that could inform future infrastructure design, purchases, and policies. The working group helped develop the survey that is the basis for an internal report. While the report was for internal use, the survey and resulting data may be generally useful and are being released publicly.

    From October 24 to November 8, 2016 we administered a 17-question survey (Appendix A) by emailing a Survey Monkey link to all ARS Research Leaders, intending to cover the data storage needs of all 1,675 SY (Category 1 and Category 4) scientists. We designed the survey to accommodate either individual researcher responses or group responses. Research Leaders could decide, based on their unit's practices or their management preferences, whether to delegate the response to a data management expert in their unit, to ask all members of their unit to respond individually, or to collate their unit's responses themselves before reporting in the survey.

    Larger storage ranges cover vastly different amounts of data, so the implications could be significant depending on whether the true amount is at the lower or higher end of the range. We therefore requested more detail from "Big Data users," the 47 respondents who indicated a total current data volume of more than 10 TB and up to 100 TB, or over 100 TB (Q5); all other respondents are called "Small Data users." Because not all of these follow-up requests were successful, we used the actual follow-up responses to estimate likely responses for those who did not respond. We defined active data as data that would be used within the next six months; all other data are considered inactive, or archival. To calculate per-person storage needs we used the high end of the reported range divided by 1 for an individual response, or by G, the number of individuals in a group response. For Big Data users we used the actual reported values or estimated likely values.

    Resources in this dataset:

    Resource Title: Appendix A: ARS data storage survey questions. File Name: Appendix A.pdf. Resource Description: The full list of questions asked, with the possible responses. The survey was not administered using this PDF; the PDF was generated directly from the administered survey using the Print option under Design Survey. Asterisked questions were required. A list of Research Units and their associated codes was provided in a drop-down not shown here. Resource Software Recommended: Adobe Acrobat, url: https://get.adobe.com/reader/

    Resource Title: CSV of Responses from ARS Researcher Data Storage Survey. File Name: Machine-readable survey response data.csv. Resource Description: CSV file of raw responses from the administered survey, as downloaded unfiltered from Survey Monkey, including incomplete responses, plus additional classification and calculations to support analysis. Individual email addresses and IP addresses have been removed. This is the same data as in the Excel spreadsheet (also provided).

    Resource Title: Responses from ARS Researcher Data Storage Survey. File Name: Data Storage Survey Data for public release.xlsx. Resource Description: MS Excel worksheet of raw responses from the administered survey, as downloaded unfiltered from Survey Monkey, including incomplete responses, plus additional classification and calculations to support analysis. Individual email addresses and IP addresses have been removed. Resource Software Recommended: Microsoft Excel, url: https://products.office.com/en-us/excel
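    The per-person calculation described above is simple arithmetic over the released CSV. A minimal sketch in Python, using hypothetical column names (range_high_tb, group_size) since the actual headers should be checked against the survey questions in Appendix A:

    ```python
    import pandas as pd

    # Hypothetical column names; the released CSV's actual headers differ and
    # should be checked against the survey questions in Appendix A.
    df = pd.read_csv("Machine-readable survey response data.csv")

    # High end of the reported storage range (TB) divided by 1 for an individual
    # response, or by G, the number of individuals covered by a group response.
    df["per_person_tb"] = df["range_high_tb"] / df["group_size"].clip(lower=1)
    print(df["per_person_tb"].describe())
    ```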

  2. Data from: Excel Templates: A Helpful Tool for Teaching Statistics

    • tandf.figshare.com
    zip
    Updated May 30, 2023
    Cite
    Alejandro Quintela-del-Río; Mario Francisco-Fernández (2023). Excel Templates: A Helpful Tool for Teaching Statistics [Dataset]. http://doi.org/10.6084/m9.figshare.3408052.v2
    Explore at:
    zip
    Dataset updated
    May 30, 2023
    Dataset provided by
    Taylor & Francis
    Authors
    Alejandro Quintela-del-Río; Mario Francisco-Fernández
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This article describes a free, open-source collection of templates for the popular Excel (2013 and later versions) spreadsheet program. These templates are spreadsheet files that allow easy, intuitive learning and the implementation of practical examples concerning descriptive statistics, random variables, confidence intervals, and hypothesis testing. Although they are designed to be used with Excel, they can also be employed with other free spreadsheet programs (changing some particular formulas). Moreover, we exploit some possibilities of the ActiveX controls of the Excel Developer Menu to build interactive Gaussian density charts. Finally, it is important to note that they can often be embedded in a web page, so Excel itself is not necessary to use them. These templates have been designed as a useful tool for teaching basic statistics and for carrying out data analysis even when the students are not familiar with Excel. Additionally, they can be used as a complement to other analytical software packages. They aim to assist students in learning statistics within an intuitive working environment. Supplementary materials with the Excel templates are available online.

  3. Data from: Delta Produce Sources Study

    • catalog.data.gov
    • agdatacommons.nal.usda.gov
    Updated Apr 21, 2025
    Cite
    Agricultural Research Service (2025). Delta Produce Sources Study [Dataset]. https://catalog.data.gov/dataset/delta-produce-sources-study-51a7a
    Dataset updated
    Apr 21, 2025
    Dataset provided by
    Agricultural Research Servicehttps://www.ars.usda.gov/
    Description

    The Delta Produce Sources Study was an observational study designed to measure and compare the food environments of farmers markets (n=3) and grocery stores (n=12) in 5 rural towns located in the Lower Mississippi Delta region of Mississippi. Data were collected via electronic surveys from June 2019 to March 2020 using a modified version of the Nutrition Environment Measures Survey (NEMS) Farmers Market Audit tool. The tool was modified to collect information on the source of fresh produce and for use with both farmers markets and grocery stores. Availability, source, quality, and price information were collected and compared between farmers markets and grocery stores for 13 fresh fruits and 32 fresh vegetables via SAS software programming. Because the towns were not randomly selected and the sample sizes are relatively small, the data may not be generalizable to all rural towns in the Lower Mississippi Delta region of Mississippi.

    Resources in this dataset:

    Resource Title: Delta Produce Sources Study dataset. File Name: DPS Data Public.csv. Resource Description: The dataset contains variables corresponding to availability, source (country, state and town if the country is the United States), quality, and price (by weight or volume) of 13 fresh fruits and 32 fresh vegetables sold in farmers markets and grocery stores located in 5 Lower Mississippi Delta towns. Resource Software Recommended: Microsoft Excel, url: https://www.microsoft.com/en-us/microsoft-365/excel

    Resource Title: Delta Produce Sources Study data dictionary. File Name: DPS Data Dictionary Public.csv. Resource Description: This file is the data dictionary corresponding to the Delta Produce Sources Study dataset. Resource Software Recommended: Microsoft Excel, url: https://www.microsoft.com/en-us/microsoft-365/excel

  4. Superstore Sales

    • kaggle.com
    Updated Jul 7, 2023
    Cite
    gillh3877 (2023). Superstore Sales [Dataset]. https://www.kaggle.com/datasets/gillh3877/superstore-sales
    Explore at: Croissant
    Dataset updated
    Jul 7, 2023
    Dataset provided by
    Kagglehttp://kaggle.com/
    Authors
    gillh3877
    Description

    This dataset contains Superstore sales for the last three months at three different locations: A, B, and C. The project motivation was to create a visual dashboard that helps a business manager find the weak areas each location can work on to become more profitable. As a first step, I cleaned the dataset to work on it and changed the date column to text. I then created 8 pivot tables with graphs and, finally, an Excel dashboard. Thanks!

  5. 18 excel spreadsheets by species and year giving reproduction and growth...

    • catalog.data.gov
    • data.wu.ac.at
    Updated Aug 17, 2024
    + more versions
    Cite
    U.S. EPA Office of Research and Development (ORD) (2024). 18 excel spreadsheets by species and year giving reproduction and growth data. One excel spreadsheet of herbicide treatment chemistry. [Dataset]. https://catalog.data.gov/dataset/18-excel-spreadsheets-by-species-and-year-giving-reproduction-and-growth-data-one-excel-sp
    Dataset updated
    Aug 17, 2024
    Dataset provided by
    United States Environmental Protection Agencyhttp://www.epa.gov/
    Description

    Excel spreadsheets by species (the 4-letter code is an abbreviation for the genus and species used in the study; the year, 2010 or 2011, is the year the data were collected; SH indicates data for Science Hub; the date is the date of file preparation). The data in each file are described in a read me file, which is the first worksheet in each file. Each row in a species spreadsheet is for one plot (plant). The data themselves are in the data worksheet. One file includes a read me description of the columns in the data set for chemical analysis; in this file, one row is an herbicide treatment and sample for chemical analysis (if taken). This dataset is associated with the following publication: Olszyk, D., T. Pfleeger, T. Shiroyama, M. Blakely-Smith, E. Lee, and M. Plocher. Plant reproduction is altered by simulated herbicide drift to constructed plant communities. Environmental Toxicology and Chemistry (Society of Environmental Toxicology and Chemistry, Pensacola, FL, USA), 36(10): 2799-2813, 2017.
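    Since each workbook keeps its documentation in a leading read me worksheet and its observations in a separate data worksheet, one might read a species file as sketched below (the file name is illustrative and the worksheet name "data" is an assumption; check each file's own read me):

    ```python
    import pandas as pd

    # Illustrative file name; worksheet layout is documented in each file's
    # first ("read me") worksheet.
    path = "ABCD_2010_SH_2017-01-01.xlsx"
    readme = pd.read_excel(path, sheet_name=0, header=None)  # column descriptions
    data = pd.read_excel(path, sheet_name="data")            # one row per plot (plant)

    print(readme.head())
    print(data.shape)
    ```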

  6. Dairy Supply Chain Sales Dataset

    • data.niaid.nih.gov
    • zenodo.org
    Updated Jul 12, 2024
    + more versions
    Cite
    Athanasios Liatifis (2024). Dairy Supply Chain Sales Dataset [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_7853252
    Dataset updated
    Jul 12, 2024
    Dataset provided by
    Dimitrios Pliatsios
    Christos Chaschatzis
    Athanasios Liatifis
    Dimitris Iatropoulos
    Vasileios Argyriou
    Thomas Lagkas
    Panagiotis Sarigiannidis
    Konstantinos Georgakidis
    Anna Triantafyllou
    Ilias Siniosoglou
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    1. Introduction

    Sales data collection is a crucial aspect of any manufacturing industry as it provides valuable insights about the performance of products, customer behaviour, and market trends. By gathering and analysing this data, manufacturers can make informed decisions about product development, pricing, and marketing strategies in Internet of Things (IoT) business environments like the dairy supply chain.

    One of the most important benefits of the sales data collection process is that it allows manufacturers to identify their most successful products and target their efforts towards those areas. For example, if a manufacturer notices that a particular product is selling well in a certain region, this information can be used to develop new products, optimise the supply chain, or improve existing products to meet the changing needs of customers.

    This dataset includes information about 7 of MEVGAL's products [1]. The published data will help researchers understand the dynamics of the dairy market and its consumption patterns, creating fertile ground for synergies between academia and industry and eventually helping the industry make informed decisions regarding product development, pricing and market strategies in the IoT playground. The dataset can also be used to study the impact of external factors on the dairy market, such as economic, environmental, and technological factors, and to understand the current state of the dairy industry and identify potential opportunities for growth and development.

    2. Citation

    Please cite the following papers when using this dataset:

    I. Siniosoglou, K. Xouveroudis, V. Argyriou, T. Lagkas, S. K. Goudos, K. E. Psannis and P. Sarigiannidis, "Evaluating the Effect of Volatile Federated Timeseries on Modern DNNs: Attention over Long/Short Memory," in the 12th International Conference on Circuits and Systems Technologies (MOCAST 2023), April 2023, Accepted

    3. Dataset Modalities

    The dataset includes data regarding the daily sales of a series of dairy product codes offered by MEVGAL. In particular, the dataset includes information gathered by the logistics division and agencies within the industrial infrastructures overseeing the production of each product code. The products included in this dataset represent the daily sales and logistics of a variety of yogurt-based stock. Each file includes the logistics for one product on a daily basis for three years, from 2020 to 2022.

    3.1 Data Collection

    The process of building this dataset involves several steps to ensure that the data is accurate, comprehensive and relevant.

    The first step is to determine the specific data that is needed to support the business objectives of the industry, i.e., in this publication’s case the daily sales data.

    Once the data requirements have been identified, the next step is to implement an effective sales data collection method. In MEVGAL’s case this is conducted through direct communication and reports generated each day by representatives & selling points.

    It is also important for MEVGAL to ensure that the data collection process is conducted in an ethical and compliant manner, adhering to data privacy laws and regulations. The industry also has a data management plan in place to ensure that the data is securely stored and protected from unauthorised access.

    The published dataset consists of 13 features providing information about the date and the number of products that have been sold. Finally, the dataset was anonymised in consideration of the privacy requirements of the data owner (MEVGAL).

    | File | Period | Number of Samples (days) |
    | :--- | :--- | ---: |
    | product 1 2020.xlsx | 01/01/2020–31/12/2020 | 363 |
    | product 1 2021.xlsx | 01/01/2021–31/12/2021 | 364 |
    | product 1 2022.xlsx | 01/01/2022–31/12/2022 | 365 |
    | product 2 2020.xlsx | 01/01/2020–31/12/2020 | 363 |
    | product 2 2021.xlsx | 01/01/2021–31/12/2021 | 364 |
    | product 2 2022.xlsx | 01/01/2022–31/12/2022 | 365 |
    | product 3 2020.xlsx | 01/01/2020–31/12/2020 | 363 |
    | product 3 2021.xlsx | 01/01/2021–31/12/2021 | 364 |
    | product 3 2022.xlsx | 01/01/2022–31/12/2022 | 365 |
    | product 4 2020.xlsx | 01/01/2020–31/12/2020 | 363 |
    | product 4 2021.xlsx | 01/01/2021–31/12/2021 | 364 |
    | product 4 2022.xlsx | 01/01/2022–31/12/2022 | 364 |
    | product 5 2020.xlsx | 01/01/2020–31/12/2020 | 363 |
    | product 5 2021.xlsx | 01/01/2021–31/12/2021 | 364 |
    | product 5 2022.xlsx | 01/01/2022–31/12/2022 | 365 |
    | product 6 2020.xlsx | 01/01/2020–31/12/2020 | 362 |
    | product 6 2021.xlsx | 01/01/2021–31/12/2021 | 364 |
    | product 6 2022.xlsx | 01/01/2022–31/12/2022 | 365 |
    | product 7 2020.xlsx | 01/01/2020–31/12/2020 | 362 |
    | product 7 2021.xlsx | 01/01/2021–31/12/2021 | 364 |
    | product 7 2022.xlsx | 01/01/2022–31/12/2022 | 365 |

    3.2 Dataset Overview

    The following table enumerates and explains the features included across all of the included files.

    | Feature | Description | Unit |
    | :--- | :--- | :--- |
    | Day | Day of the month | - |
    | Month | Month | - |
    | Year | Year | - |
    | daily_unit_sales | Daily sales: the amount of product, measured in units, sold on that specific day | units |
    | previous_year_daily_unit_sales | Previous year's sales: the amount of product, measured in units, sold on that specific day the previous year | units |
    | percentage_difference_daily_unit_sales | The percentage difference between the two values above | % |
    | daily_unit_sales_kg | The amount of product, measured in kilograms, sold on that specific day | kg |
    | previous_year_daily_unit_sales_kg | Previous year's sales: the amount of product, measured in kilograms, sold on that specific day the previous year | kg |
    | percentage_difference_daily_unit_sales_kg | The percentage difference between the two values above | % |
    | daily_unit_returns_kg | The percentage of the products that were shipped to selling points and were returned | % |
    | previous_year_daily_unit_returns_kg | The percentage of the products that were shipped to selling points and were returned the previous year | % |
    | points_of_distribution | The number of sales representatives through which the product was sold to the market this year | - |
    | previous_year_points_of_distribution | The number of sales representatives through which the product was sold to the market on the same day the previous year | - |

    Table 1 – Dataset Feature Description

    4. Structure and Format

    4.1 Dataset Structure

    The provided dataset has the following structure:

    | Name | Type | Property |
    | :--- | :--- | :--- |
    | Readme.docx | Report | A file that contains the documentation of the dataset. |
    | product X | Folder | A folder containing the data of product X. |
    | product X YYYY.xlsx | Data file | An Excel file containing the sales data of product X for year YYYY. |

    Table 2 - Dataset File Description
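    Given the folder layout in Table 2, the yearly workbooks can be concatenated into a single frame. A minimal sketch, assuming pandas with openpyxl is available and that each workbook's first sheet holds the daily records (the column names come from Table 1):

    ```python
    from pathlib import Path
    import pandas as pd

    frames = []
    # Folder and file names follow Table 2: "product X/product X YYYY.xlsx".
    for path in sorted(Path(".").glob("product */product * *.xlsx")):
        df = pd.read_excel(path)          # first sheet assumed to hold daily records
        df["source_file"] = path.name
        frames.append(df)

    sales = pd.concat(frames, ignore_index=True)
    print(sales[["Day", "Month", "Year", "daily_unit_sales"]].head())
    ```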

    5. Acknowledgement

    This project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No. 957406 (TERMINET).

    References

    [1] MEVGAL is a Greek dairy production company

  7. Easing into Excellent Excel Practices Learning Series / Série...

    • search.dataone.org
    • borealisdata.ca
    Updated Dec 28, 2023
    Cite
    Marcoux, Julie (2023). Easing into Excellent Excel Practices Learning Series / Série d'apprentissages en route vers des excellentes pratiques Excel [Dataset]. http://doi.org/10.5683/SP3/WZYO1F
    Dataset updated
    Dec 28, 2023
    Dataset provided by
    Borealis
    Authors
    Marcoux, Julie
    Description

    With a step-by-step approach, learn to prepare Excel files, data worksheets, and individual data columns for data analysis; practice conditional formatting and creating pivot tables/charts; go over basic principles of Research Data Management as they might apply to an Excel project. Avec une approche étape par étape, apprenez à préparer pour l’analyse des données des fichiers Excel, des feuilles de calcul de données et des colonnes de données individuelles; pratiquez la mise en forme conditionnelle et la création de tableaux croisés dynamiques ou de graphiques; passez en revue les principes de base de la gestion des données de recherche tels qu’ils pourraient s’appliquer à un projet Excel.

  8. An Extensive Dataset for the Heart Disease Classification System

    • data.mendeley.com
    Updated Feb 15, 2022
    + more versions
    Cite
    Sozan S. Maghdid (2022). An Extensive Dataset for the Heart Disease Classification System [Dataset]. http://doi.org/10.17632/65gxgy2nmg.1
    Dataset updated
    Feb 15, 2022
    Authors
    Sozan S. Maghdid
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Finding a good data source is the first step toward creating a database. Cardiovascular diseases (CVDs) are the major cause of death worldwide. CVDs include coronary heart disease, cerebrovascular disease, rheumatic heart disease, and other heart and blood vessel problems. According to the World Health Organization, 17.9 million people die from CVDs each year. Heart attacks and strokes account for more than four out of every five CVD deaths, with one-third of these deaths occurring before the age of 70.

    A comprehensive database of factors that contribute to a heart attack has been constructed. The main purpose is to collect characteristics of heart attacks, or the factors that contribute to them. A form was created in Microsoft Excel to accomplish this. Figure 1 depicts the form, which has nine fields: eight input fields and one output field. Age, gender, heart rate, systolic BP, diastolic BP, blood sugar, CK-MB, and troponin test represent the input fields, while the output field pertains to the presence of a heart attack and is divided into two categories (negative and positive): negative refers to the absence of a heart attack, while positive refers to its presence. Table 1 shows detailed information and the maximum and minimum attribute values for the 1,319 cases in the whole database. To confirm the validity of this data, we looked at the patient files in the hospital archive and compared them with the data stored in the laboratory system; we also interviewed the patients and specialized doctors. Table 2 is a sample from the database showing 44 cases and the factors that lead to a heart attack.

    After collecting this data, we checked it for null values (invalid values) and for errors made during data collection. A value is null if it is unknown, and null values necessitate special treatment: they indicate that the target isn't a valid data element, and arithmetic operations on a numeric column containing one or more null values yield null. An example of null value processing is shown in Figure 2.

    The data used in this investigation were scaled between 0 and 1 to guarantee that all inputs and outputs received equal attention and to eliminate their dimensionality. Prior to the use of AI models, data normalization has two major advantages: first, it prevents attributes in larger numeric ranges from overshadowing those in smaller numeric ranges; second, it avoids numerical problems during processing. After completing the normalization process, we split the data set into two parts, using 1,060 cases for training and 259 for testing, and implemented the modeling using the input and output variables.
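    A hedged sketch of the preprocessing the description outlines (min-max scaling to [0, 1] followed by a 1,060/259 train/test split); the file name and column names below are assumptions based on the nine fields listed above, not the file's actual headers:

    ```python
    import pandas as pd

    # Assumed file and column names based on the nine fields described above.
    cols = ["age", "gender", "heart_rate", "systolic_bp", "diastolic_bp",
            "blood_sugar", "ck_mb", "troponin", "result"]
    df = pd.read_csv("heart_attack_dataset.csv", names=cols, header=0)

    features = df.drop(columns="result")
    # Min-max scaling to [0, 1], as described in the dataset methodology.
    scaled = (features - features.min()) / (features.max() - features.min())

    # 1,060 training cases and 259 test cases, as reported above (1,319 total).
    train_X, test_X = scaled.iloc[:1060], scaled.iloc[1060:1319]
    train_y, test_y = df["result"].iloc[:1060], df["result"].iloc[1060:1319]
    print(train_X.shape, test_X.shape)
    ```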

  9. Immigration statistics data tables, year ending December 2020

    • gov.uk
    Updated Feb 25, 2021
    + more versions
    Cite
    Home Office (2021). Immigration statistics data tables, year ending December 2020 [Dataset]. https://www.gov.uk/government/statistical-data-sets/immigration-statistics-data-tables-year-ending-december-2020
    Dataset updated
    Feb 25, 2021
    Dataset provided by
    GOV.UKhttp://gov.uk/
    Authors
    Home Office
    Description

    The Home Office has changed the format of the published data tables for a number of areas (asylum and resettlement, entry clearance visas, extensions, citizenship, returns, detention, and sponsorship). These now include summary tables, and more detailed datasets (available on a separate page, link below). A list of all available datasets on a given topic can be found in the ‘Contents’ sheet in the ‘summary’ tables. Information on where to find historic data in the ‘old’ format is in the ‘Notes’ page of the ‘summary’ tables.

    The Home Office intends to make these changes in other areas in the coming publications. If you have any feedback, please email MigrationStatsEnquiries@homeoffice.gov.uk.

    Related content

    Immigration statistics, year ending September 2020
    Immigration Statistics Quarterly Release
    Immigration Statistics User Guide
    Publishing detailed data tables in migration statistics
    Policy and legislative changes affecting migration to the UK: timeline
    Immigration statistics data archives

    Asylum and resettlement

    Asylum and resettlement summary tables, year ending December 2020 (MS Excel Spreadsheet, 359 KB): https://assets.publishing.service.gov.uk/media/602bab69e90e070562513e35/asylum-summary-dec-2020-tables.xlsx

    Detailed asylum and resettlement datasets

    Sponsorship

    Sponsorship summary tables, year ending December 2020 (MS Excel Spreadsheet, 67.7 KB): https://assets.publishing.service.gov.uk/media/602bab8fe90e070552b33515/sponsorship-summary-dec-2020-tables.xlsx

    Detailed sponsorship datasets

    Entry clearance visas granted outside the UK

    Entry clearance visas summary tables, year ending December 2020 (MS Excel Spreadsheet, 70.3 KB): https://assets.publishing.service.gov.uk/media/602bf8708fa8f50384219401/visas-summary-dec-2020-tables.xlsx

    Detailed entry clearance visas datasets

    Passenger arrivals (admissions)

    Passenger arrivals (admissions) summary tables, year ending December 2020 (MS Excel Spreadsheet, 70.6 KB): https://assets.publishing.service.gov.uk/media/602bac148fa8f5037f5d849c/passenger-arrivals-admissions-summary-dec-2020-tables.xlsx

    Detailed Passengers initially refused entry at port datasets

    Extensions

    Extensions summary tables, year ending December 2020 (MS Excel Spreadsheet, 41.5 KB): https://assets.publishing.service.gov.uk/media/602bac3d8fa8f50383c41f7c/extentions-summary-dec-2020-tables.xlsx


  10. Annual Retail Store Data, 2000 [Canada] [Excel]

    • dataverse.scholarsportal.info
    • borealisdata.ca
    • +1more
    pdf, xls
    Updated Nov 17, 2021
    + more versions
    Cite
    Scholars Portal Dataverse (2021). Annual Retail Store Data, 2000 [Canada] [Excel] [Dataset]. https://dataverse.scholarsportal.info/dataset.xhtml;jsessionid=1283d69ee2dd528c9011fe4a2fe3?persistentId=hdl%3A10864%2F11351&version=&q=&fileTypeGroupFacet=&fileAccess=&fileTag=%22Tables%22&fileSortField=&fileSortOrder=
    Explore at:
    xls(2165760), xls(29696), xls(2920448), pdf(76787), pdf(158404), xls(34816), xls(2754048), pdf(81084), pdf(71183), xls(34304), xls(625664), xls(2707968), xls(695808), pdf(70673), pdf(72585), xls(576512), xls(609792), xls(28672), pdf(60236), pdf(30338), pdf(87181), pdf(84140), pdf(92012), xls(610304), pdf(74439), xls(2471424), pdf(73788), xls(30208), pdf(74478), pdf(53645)
    Dataset updated
    Nov 17, 2021
    Dataset provided by
    Scholars Portal Dataverse
    Area covered
    Canada
    Description

    The annual Retail store data CD-ROM is an easy-to-use tool for quickly discovering retail trade patterns and trends. The current product presents results from the 1999 and 2000 Annual Retail Store and Annual Retail Chain surveys. This product contains numerous cross-classified data tables using the North American Industry Classification System (NAICS). The data tables provide access to a wide range of financial variables, such as revenues, expenses, inventory, sales per square footage (chain stores only) and the number of stores. Most data tables contain detailed information on industry (as low as 5-digit NAICS codes), geography (Canada, provinces and territories) and store type (chains, independents, franchises). The electronic product also contains survey metadata, questionnaires, information on industry codes and definitions, and the list of retail chain store respondents.

  11. Data for: Sustainable connectivity in a community repository

    • explore.openaire.eu
    • data.niaid.nih.gov
    • +3more
    Updated Dec 7, 2023
    Cite
    Ted Habermann (2023). Data for: Sustainable connectivity in a community repository [Dataset]. http://doi.org/10.5061/dryad.nzs7h44xr
    Dataset updated
    Dec 7, 2023
    Authors
    Ted Habermann
    Description

    Data For: Sustainable Connectivity in a Community Repository

    ## GENERAL INFORMATION

    This readme.txt file was generated on 20231110 by Ted Habermann.

    ### Title of Dataset
    Data For: Sustainable Connectivity in a Community Repository

    ### Author Information
    Principal Investigator Contact Information
    Name: Ted Habermann (0000-0003-3585-6733)
    Institution: Metadata Game Changers
    ORCID: 0000-0003-3585-6733

    ### Date published or finalized for release:
    November 10, 2023

    ### Date of data collection (single date, range, approximate date)
    May and June 2023

    ### Information about funding sources that supported the collection of the data:
    National Science Foundation (Crossref Funder ID: 100000001) Award 2134956.

    ### Overview of the data (abstract):
    These data are Dryad metadata retrieved from Dryad and translated into csv files. There are two datasets:
    1. DryadJournalDataset was retrieved from Dryad using the ISSNs in the file DryadJournalDataset_ISSNs.txt, although some had no data.
    2. DryadOrganizationDataset was retrieved from Dryad using the RORs in the file DryadOrganizationDataset_RORs.txt, although some had no data.

    Each dataset includes four types of metadata: identifiers, funders, keywords, and related works, each in a separate comma (.csv) or tab (.tsv) delimited file. There are also Microsoft Excel files (.xlsx) for the identifier metadata and connectivity summaries for each dataset (*.html). The connectivity summaries include summaries of each parameter in all four data files with definitions, counts, unique counts, most frequent values, and completeness. These data formed the basis for an analysis of the connectivity of the Dryad repository for organizations, funders, and people.

    | Size | FileName |
    | --------: | :--- |
    | 90541505 | DryadJournalDataset_Identifiers_20230520_12.csv |
    | 9017051 | DryadJournalDataset_funders_20230520_12.tsv |
    | 29108477 | DryadJournalDataset_keywords_20230520_12.tsv |
    | 8833842 | DryadJournalDataset_relatedWorks_20230520_12.tsv |
    | 18260935 | DryadOrganizationDataset_funders_20230601_12.tsv |
    | 240128730 | DryadOrganizationDataset_identifiers_20230601_12.tsv |
    | 39600659 | DryadOrganizationDataset_keywords_20230601_12.tsv |
    | 11520475 | DryadOrganizationDataset_relatedWorks_20230601_12.tsv |
    | 40726143 | DryadJournalDataset_identifiers_20230520_12.xlsx |
    | 81894301 | DryadOrganizationDataset_identifiers_20230601_12.xlsx |
    | 842827 | DryadJournalDataset_ConnectivitySummary.html |
    | 387551 | DryadOrganizationDataset_ConnectivitySummary.html |

    ### Field Definitions

    ## SHARING/ACCESS INFORMATION

    ### Licenses/restrictions placed on the data:
    Creative Commons Public Domain License (CC0)

    ### Links to publications that cite or use the data:
    TBD

    ### Was data derived from another source?
    No

    ## DATA & FILE OVERVIEW

    ### File List
    A. *Dataset_identifiers_YYYYMMDD_HH.*sv: Identifier metadata from Dryad for Dataset collected at YYYYMMDD_HH using the Dryad API.
    B. *Dataset_funders_YYYYMMDD_HH.*sv: Funder metadata from Dryad for Dataset collected at YYYYMMDD_HH using the Dryad API.
    C. *Dataset_keywords_YYYYMMDD_HH.*sv: Keyword metadata from Dryad for Dataset collected at YYYYMMDD_HH using the Dryad API.
    D. *Dataset_relatedWorks_YYYYMMDD_HH.*sv: Related work metadata from Dryad for Dataset collected at YYYYMMDD_HH using the Dryad API.
    E. *Dataset_identifiers_YYYYMMDD_HH.xlsx: Excel spreadsheet with identifier metadata from Dryad for Dataset collected at YYYYMMDD_HH using the Dryad API.
    F. *Dataset_ConnectivitySummary.html: Connectivity summary for Dataset.
    G. summarizeConnectivity.ipynb: Python notebook with code for creating connectivity summaries and plots.

    ### Relationship between files:
    All files with the same dataset name make up a dataset. The *sv files are the original metadata extracted from Dryad.

    ## METHODOLOGICAL INFORMATION

    ### Description of methods used for collection/generation of data:
    Most of the analysis is simply extracting and comparing counts of various metadata elements.

    ## DATA-SPECIFIC INFORMATION

    See the connectivity summaries (*ConnectivitySummary.html) for a list of parameters in each file and summaries of their values.

    ### Identifier Metadata
    The identifier metadata datasets include the following fields:

    | Field | Definition |
    | :--- | :--- |
    | DOI | Digital object identifier for the dataset |
    | title | Title for the dataset |
    | datePublished | Date dataset published |
    | relatedPublicationISSN | International Standard Serial Number for journal with related publication |
    | primary_article | Digital object identifier for pr...
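    The delimited files can be profiled along the lines of the connectivity summaries (counts, unique counts, most frequent values, completeness). A minimal sketch with pandas, using a file name from the list above; the dataset's own summarizeConnectivity.ipynb presumably does this more thoroughly:

    ```python
    import pandas as pd

    # Tab-delimited funder metadata; file name taken from the file list above.
    funders = pd.read_csv("DryadJournalDataset_funders_20230520_12.tsv", sep="\t")

    # Per-field profile: non-null count, unique count, most frequent value, completeness.
    profile = pd.DataFrame({
        "count": funders.count(),
        "unique": funders.nunique(),
        "top": funders.mode().iloc[0],
        "completeness": funders.notna().mean().round(3),
    })
    print(profile)
    ```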

  12. Galilee geological model 25-05-15

    • data.gov.au
    • researchdata.edu.au
    zip
    Updated Apr 13, 2022
    + more versions
    Cite
    Bioregional Assessment Program (2022). Galilee geological model 25-05-15 [Dataset]. https://data.gov.au/data/dataset/bd1c35a0-52c4-421b-ac7d-651556670eb9
    Explore at:
    zip(122560650)
    Dataset updated
    Apr 13, 2022
    Dataset authored and provided by
    Bioregional Assessment Program
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Abstract

    This dataset was derived by the Bioregional Assessment Programme. The parent datasets are identified in the Lineage statement in this metadata statement. The processes undertaken to produce this derived dataset are described in the History field in this metadata statement.

    This dataset comprises interpreted elevation surfaces and contours for the major Triassic and Upper Permian units of the Galilee Geological Basin.

    Purpose

    This dataset was created to provide formation extents for aquifers in the Galilee geological basin.

    Dataset History

    A Quality Assurance (QA) and validation process was conducted on the original well and bore data to choose wells/bores that are within 25 kilometres of the BA Galilee Region extent.

    The QA/Validation process is as follows:

    1. Well data

      a. Obtained excel file "QPED_July_2013_galilee.xlsx" from GA

      b. Based on stratigraphic information in "BH_costrat" tab formation names were regularised and simplified based on current naming conventions.

      c. Simplified names added to QPED_July_2013_galileet.xlsx as "Steve_geo" and "Steve_group"

      d. Produced new file "GSQ_Geology.xlsx" contained decimal latitude and longitude, KB elevation, top of unit in metres from KB, top of unit in metres AHD, bottom of unit in metres from KB, bottom of unit in metres AHD, original geology, simplified geology, simplified Group geology.

       i.     KB obtained from "BH_wellhist"
      
       ii.    Where no KB information was available (i.e. KB=0), sample the 1S DEM at the well's location to obtain the height; KB = DEM + 10. The well was marked as having lower reliability.
      
       iii.    Calculated Top_m_AHD = KB - Top_m_KB
      
       iv.    Calculated Bottom_m_AHD = KB - Bottom_m_KB
      

      e. Brought GSQ_Geology.xlsx into ArcGIS

      f. Selected wells based on "Steve_geo" field for each model layer to produce a geodatabase for each layer.

       i.     GSQ_basement_wells
      
       ii.    GSQ_top_joe_joe_group
      
       iii.    GSQ_top_bandanna_merge
      
       iv.    GSQ_rewan_group
      
       v.     GSQ_clematis
      
       vi.    GSQ_moolyember
      

      g. Additional wells and reinterpreted tops added to appropriate geodatabase based on well completion reports

      h. Additional wells added to coverages to help model building process

       i.     Well_name listed as Fake
      
       ii.    Exception being GSQ_top_basement_fake which was created as a separate geodatabase
      
    2. Bore data

      a. Obtained QLD_DNRM_GroundwaterDatabaseExtract_20131111 from GA

      b. Used files REGISTRATIONS.txt, ELEVATIONS.txt and AQUIFER.txt to build GW_stratigraphy.xlsx

       i.     Based on RN
      
       ii.    Latitude from GIS_LAT (REGISTRATIONS.txt)
      
       iii.    Longitude from GIS_LNG (REGISTRATIONS.txt)
      
       iv.    Elevation from (ELEVATIONS.txt)
      
       v.     FORM_DESC from (AQUIFER.txt)
      
       vi.    Top from (AQUIFER.txt)
      
       vii.    Bottom from (AQUIFER.txt)
      

      c. Brought GW_stratigraphy.xlsx into ArcGIS

      d. Created gw_bores_galilee_dem

       i.     Sampled 1S DEM to obtain ground level elevation column RASTERVALU
      
       ii.    Created column top_m_AHD by RASTERVALU - Top
      

      e. Selected bores based on "FORM_DESC" field for each model layer to produce a geodatabase for each layer.

       i.     Gw_basement
      
       ii.    GW_bores_joe_joe_group
      
       iii.    GW_bores_bandanna
      
       iv.    Gw_bores_rewan
      
       v.     Gw_bores_clematis
      
       vi.    Gw_bores_moolyember
      
    3. Georectified seismic surfaces

      a. Extracted interpreted seismic surfaces for base Permian (interpreted as basement) and top Bandanna (in time) from the following seismic surveys

       i.     Y80A, W81A, Carmichael, Pendine, T81A, Quilpie, Ward and Powell Creek seismic survey downloaded https://qdexguest.deedi.qld.gov.au/portal/site/qdex/search?searchType=general 
      
       ii.    Brought TIF images into ArcGIS and georectified
      
       iii.    Digitised shape of contours and faults into geodatabase
      
           1.   Basement_contours and basement_faults
      
           2.   bandanna_contours_new_data and bandanna_faults
      
       iv.    Added field "contour" to geodatabase
      
       v.     Converted contours to depth in "contour" field based on well and bore data (top_m_AHD) and contour progression
      
       vi.    Use the shape and depth derived from OZ SEEBASE to help to add additional contours and faults to basement and bandanna datasets
      
    4. Additional contour and fault surfaces were built derived from underlying surfaces and wells/bore data

      a. Joejoe_contours and joejoe_faults

      b. Rewan_contour_clip (used bandanna_faults as fault coverage)

      c. Clematis_contour and clematis_faults

      d. Moolyember_contour (used clematis_faults as fault coverage)

    5. Surface geology

      a. Extracted surface geology from QUEENSLAND GEOLOGY_AUGUST_2012 using Galilee BA region boundary with 25 kilometre boundary to form geodatabase QLD_geology_galilee

      b. Selected relevant surface geology from QLD_geology_galilee based on field "Name" for each model layer and created new geodatabase layers

       i.     Basement_geology: Argentine Metamorphics,Running River Metamorphics,Charters Towers Metamorphics; Bimurra Volcanics, Foyle Volcanics, Mount Wyatt Formation, Saint Anns Formation, Silver Hills Volcanics, Stones Creek Volcanics; Bulliwallah Formation, Ducabrook Formation, Mount Rankin Formation, Natal Formation, Star of Hope Formation; Cape River Metamorphics; Einasleigh Metamorphics; Gem Park Granite; Macrossan Province Cambrian-Ordovician intrusives; Macrossan Province Ordovician-Silurian intrusives; Macrossan Province Ordovician intrusives; Mount Formartine, unnamed plutonic units; Pama Province Silurian-Devonian intrusives; Seventy Mile Range Group; and Kirk River beds, Les Jumelles beds.
      
       ii.    Joe_joe_geology: Joe Joe Group
      
       iii.    Galilee_permian_geology: Back Creek Group, Betts Creek Group, Blackwater Group
      
       iv.    Rewan_geology: Rewan Group
      
          1.    Later also made dunda_beds_geology to be included in Rewan model: Dunda beds
      
       v.     Clematis_geology: Clematis Group
      
          1.    Later also made warang_sandstone_geology to be included in Clematis model: Warang Sandstone
      
       vi.    Moolyember_surface_geology: Moolyember Formation
      
    6. DEM for each model layer

      a. Using surface geology geodatabase extent extract grid from dem_s_1s to represent the top of the model layer at the surface

       i.     Basement_dem
      
       ii.    Joejoe_dem
      
       iii.    Bandanna_dem
      
       iv.    Rewan_dem and dunda_dem
      
       v.     Clematis_dem and warang_dem
      
       vi.    Moolyember_surface_dem
      

      b. Used Contour tool in ArcGIS to obtain a 25 metre contour geodatabase from the relevant model DEM

       i.     Basement_dem_contours
      
       ii.    Joejoe_dem_contours
      
       iii.    Bandanna_dem_contours
      
       iv.    Rewan_dem_contours and dunda_dem_contours
      
       v.     Clematis_dem_contours and warang_dem_contours
      
       vi.    Moolyember_dem_contours
      

      c. For the purpose of guiding the model building process, additional fields were added to each DEM contour geodatabase based on average thickness derived from groundwater bores and petroleum wells.

       i.     Basement_dem_contours: Joejoe, bandanna, rewan, clematis, moolyember
      
       ii.    Joejoe_dem_contours: basement, bandanna
      
       iii.    Bandanna_dem_contours: joejoe, rewan
      
       iv.    Rewan_dem_contours and dunda_dem_contours: clematis, rewan
      
       v.     Clematis_dem_contours and warang_dem_contours: moolyember, rewan
      
       vi.  Moolyember_dem_contours: clematis
      

    The model building process is as follows:

    1. Used the Topo to Raster tool to create surfaces based on the following rules

      a. Environment

          i.  Extent
      
             1. Top: -19.7012030024424
      
             2. Right: 148.891511819054
      
             3. Bottom: -27.5812030024424
      
             4. Left: 139.141511819054
      
          ii. Output cell size: 0.01 degrees
      
          iii. Drainage enforcement: No_enforce
      

      b. Input

          i.  Basement
      
             1. Basement_dem_contour; field - contour; type - contour
      
             2. Joejoe_dem_contour; field - basement; type - contour
      
             3. Basement_contour; field - contour; type - contour
      
             4. GSQ_basement_wells; field - top_m_AHD; type - point elevation
      
             5. GW_basement; field - top_m_AHD; type - point elevation
      
             6. GSQ_top_basement_fake; field - top_m_AHD; type - point elevation
      
             7. Basement_faults; type - cliff
      
         ii.  Joe Joe Group
      
             1. Joejoe_dem_contour; field - basement; type - contour
      
             2. Basement_dem_contour; field - joejoe; type - contour
      
             3. permian_dem_contour; field - joejoe, type - contour
      
             4. joejoe_contour; field - joejoe; type - contour
      
             5. GSQ_top_joejoe_group; field - top_m_AHD; type - point elevation
      
             6. GW_bores_joe_joe_group; field - top_m_AHD; type - point elevation
      
             7. joejoe_faults; type - cliff
      
         iii.  Bandanna Group
      
             1. Permian_dem_contour; field - contour; type - contour
      
             2. Joejoe_dem_contour; field - bandanna; type - contour
      
             3. Rewan_dem_contour: field - bandanna; type - contour
      
             4. Dunda_dem_contour; field - bandanna; type - contour
      
  13. FOI: early years dataset as at 31 March 2016

    • gov.uk
    • s3.amazonaws.com
    Updated Jul 21, 2021
    + more versions
    Cite
    Ofsted (2021). FOI: early years dataset as at 31 March 2016 [Dataset]. https://www.gov.uk/government/statistical-data-sets/foi-early-years-dataset-as-at-31-march-2016
    Dataset updated
    Jul 21, 2021
    Dataset provided by
    GOV.UKhttp://gov.uk/
    Authors
    Ofsted
    Description

    There is a requirement that public authorities, like Ofsted, must publish updated versions of datasets which are disclosed as a result of Freedom of Information requests.

    Some information which is requested is exempt from disclosure to the public under the Freedom of Information Act; it is therefore not appropriate for this information to be made available. Examples of information which it is not appropriate to make available include the locations of women's refuges, some military bases and all children's homes, and the personal data of providers and staff. Ofsted also considers that the names and addresses of registered childminders are their personal data, which it is not appropriate to make publicly available unless those individuals have given their explicit consent to do so. This information has therefore not been included in the datasets.

    Data for both childcare and childminders are included in the excel file.

    FOI: early years dataset as at 31 March 2016 (MS Excel Spreadsheet, 16.6 MB): https://assets.publishing.service.gov.uk/media/60f7f6a4d3bf7f568160edb1/FOI_early_years_dataset_as_at_31_March_2016.xlsx
    
    
    
    
    

  14. V-Dem v 12 dataset, all variables, trimmed to 12 countries

    • borealisdata.ca
    • search.dataone.org
    Updated Mar 27, 2024
    + more versions
    Cite
    Wally Seccombe (2024). V-Dem v 12 dataset, all variables, trimmed to 12 countries [Dataset]. http://doi.org/10.5683/SP3/988EOU
    Explore at: Croissant
    Dataset updated
    Mar 27, 2024
    Dataset provided by
    Borealis
    Authors
    Wally Seccombe
    License

    CC0 1.0 Universal Public Domain Dedicationhttps://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    From the massive set of V-Dem v 12 variables, we have inserted 27 into the main CPEDB dataset. Here is the entire set, organized in an Excel file to match the country/year rows of the main SPSS file. This precise correspondence makes it easy to insert other variables from the V-Dem dataset into the main file, where they can be statistically combined with a wide variety of variables from other sources.
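    Because the Excel file is row-aligned with the country/year rows of the main SPSS file, pulling additional V-Dem variables into the main dataset amounts to a keyed merge. A hedged sketch with pandas; the file names, key column names, and the example variable are assumptions, and reading the .sav file requires pyreadstat:

    ```python
    import pandas as pd

    # Hypothetical file and column names; adjust to the actual key columns in both files.
    vdem = pd.read_excel("vdem_v12_12_countries.xlsx")
    main = pd.read_spss("cpedb_main.sav")  # requires the pyreadstat package

    # Merge a selected V-Dem variable into the main dataset on country/year keys.
    merged = main.merge(vdem[["country_name", "year", "v2x_polyarchy"]],
                        on=["country_name", "year"], how="left")
    print(merged.shape)
    ```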

  15. Global Burden of Disease analysis dataset of noncommunicable disease...

    • data.mendeley.com
    Updated Apr 6, 2023
    + more versions
    Cite
    David Cundiff (2023). Global Burden of Disease analysis dataset of noncommunicable disease outcomes, risk factors, and SAS codes [Dataset]. http://doi.org/10.17632/g6b39zxck4.10
    Dataset updated
    Apr 6, 2023
    Authors
    David Cundiff
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This formatted dataset (AnalysisDatabaseGBD) originates from raw data files from the Institute of Health Metrics and Evaluation (IHME) Global Burden of Disease Study (GBD2017) affiliated with the University of Washington. We are volunteer collaborators with IHME and not employed by IHME or the University of Washington.

    The population weighted GBD2017 data are on male and female cohorts ages 15-69 years including noncommunicable diseases (NCDs), body mass index (BMI), cardiovascular disease (CVD), and other health outcomes and associated dietary, metabolic, and other risk factors. The purpose of creating this population-weighted, formatted database is to explore the univariate and multiple regression correlations of health outcomes with risk factors. Our research hypothesis is that we can successfully model NCDs, BMI, CVD, and other health outcomes with their attributable risks.

    These Global Burden of Disease data relate to the preprint: The EAT-Lancet Commission Planetary Health Diet compared with Institute of Health Metrics and Evaluation Global Burden of Disease Ecological Data Analysis. The data include the following:

    1. Analysis database of population-weighted GBD2017 data that includes over 40 health risk factors, noncommunicable disease deaths/100k/year of male and female cohorts ages 15-69 years from 195 countries (the primary outcome variable, covering over 100 types of noncommunicable diseases) and over 20 individual noncommunicable diseases (e.g., ischemic heart disease, colon cancer, etc.)
    2. A text file to import the analysis database into SAS
    3. The SAS code to format the analysis database to be used for analytics
    4. SAS code for deriving Tables 1, 2, 3 and Supplementary Tables 5 and 6
    5. SAS code for deriving the multiple regression formula in Table 4
    6. SAS code for deriving the multiple regression formula in Table 5
    7. SAS code for deriving the multiple regression formula in Supplementary Table 7
    8. SAS code for deriving the multiple regression formula in Supplementary Table 8
    9. The Excel files that accompanied the above SAS code to produce the tables

    For questions, please email davidkcundiff@gmail.com. Thanks.

  16. Sapota Fruit Datasets

    • data.mendeley.com
    Updated Jul 11, 2025
    Cite
    Anita Bhatt (2025). Sapota Fruit Datasets [Dataset]. http://doi.org/10.17632/jgtb95x6kf.2
    Dataset updated
    Jul 11, 2025
    Authors
    Anita Bhatt
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    These datasets support research in quality control, classification, and visual recognition of Sapota fruit. Dataset-1 and Dataset-2 consist of images of Sapota fruit, categorized based on their visual and physical characteristics: the images are classified into spoiled fruit and fresh fruit, with the fresh fruit further sorted by size (small, medium, and large). Dataset-1 contains images sized 224x224 pixels with a fixed white background. Dataset-2 includes images at their original size of 4000x3000 pixels, with the background removed and annotations saved in a text file. Physical parameters for the original images are saved in Excel files, from which Dataset-1 and Dataset-2 were eventually produced. Dataset-3 includes images captured against various backgrounds, showcasing different sizes of Sapota fruit.

    Overall, these datasets are designed to support research on Sapota fruit quality control, classification, and visual recognition.

  17. Graph Input Data Example.xlsx

    • figshare.com
    xlsx
    Updated Dec 26, 2018
    Cite
    Dr Corynen (2018). Graph Input Data Example.xlsx [Dataset]. http://doi.org/10.6084/m9.figshare.7506209.v1
    Explore at:
    xlsx
    Dataset updated
    Dec 26, 2018
    Dataset provided by
    Figsharehttp://figshare.com/
    Authors
    Dr Corynen
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The various performance criteria applied in this analysis include the probability of reaching the ultimate target, the costs, elapsed times and system vulnerability resulting from any intrusion. This Excel file contains all the logical, probabilistic and statistical data entered by a user, and required for the evaluation of the criteria. It also reports the results of all the computations.

  18. Waterworks — intake point reporting

    • gimi9.com
    • data.europa.eu
    Updated Feb 2, 2022
    + more versions
    Cite
    (2022). Waterworks — intake point reporting [Dataset]. https://gimi9.com/dataset/eu_https-data-norge-no-node-1499/
    Explore at:
    Dataset updated
    Feb 2, 2022
    Description

    The data sets provide an overview of selected data on waterworks registered with the Norwegian Food Safety Authority. The information has been reported by the waterworks through application processing or other reporting to the Norwegian Food Safety Authority. Drinking water regulations require, among other things, annual reporting, and the Norwegian Food Safety Authority has created a separate form service for this reporting. The data sets include public or private waterworks that supply 50 people or more. In addition, all municipally owned businesses with their own water supply are included regardless of size. The data sets also contain decommissioned facilities, so that historical data (data for previous years) remain available. There are data sets for the following supervisory objects:
    1. Water supply system (also includes analysis of drinking water)
    2. Transport system
    3. Treatment facility
    4. Intake point (also includes analysis of the water source)
    This entry provides the data set for item 4, intake point_reporting. In addition, a file (information.txt) gives an overview of when the extracts were produced and how many lines there are in the individual files; the extracts are produced weekly. Furthermore, for the water supply system, transport system, and intake point data sets it is possible to see historical data on what is included in the annual reporting. These files have the _reporting ending in the file name; to make use of that information, each _reporting file must be linked to its "mother" file to get names and other static information. Descriptions of the data fields (i.e., metadata) in the individual data sets appear in separate files, available in PDF format. If you double-click the CSV file and it opens directly in Excel, the Norwegian characters æøå will not display correctly. To see the character set correctly in Excel:
    1. Start Excel and open a new spreadsheet.
    2. Select Data, then From Text, and press Import.
    3. Choose delimited data, set File origin to 65001: Unicode (UTF-8), tick "My data has headers", and press Next.
    4. Remove Tab as separator, select Semicolon as separator, press Next, and finish the import.
    Alternatively, the complete data sets can be imported into a separate database and compiled as desired; link keys in the files make it possible to join the files together. The waterworks are responsible for the quality of the data sets. Purpose: make data on drinking water supply available to the public.
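
    As an illustrative alternative to the Excel import above (not part of the official documentation), the extracts can also be read with Python and pandas by declaring the semicolon separator and UTF-8 encoding explicitly. The file names and the "link_key" column below are placeholders; the actual link-key column names are documented in the accompanying metadata PDFs.

    ```python
    # Minimal sketch: read the semicolon-separated UTF-8 extracts and join a
    # *_reporting file to its "mother" file on an assumed link-key column.
    import pandas as pd

    read_opts = dict(sep=";", encoding="utf-8")  # semicolon separator, UTF-8 so æøå render correctly

    intake_points = pd.read_csv("intake_point.csv", **read_opts)             # static ("mother") data
    intake_reports = pd.read_csv("intake_point_reporting.csv", **read_opts)  # annual reporting rows

    # Join the reporting rows to names and other static information.
    merged = intake_reports.merge(intake_points, on="link_key", how="left")
    print(merged.head())
    ```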

  19. Delta Produce Sources Study | gimi9.com

    • gimi9.com
    Updated Feb 12, 2021
    Cite
    (2021). Delta Produce Sources Study | gimi9.com [Dataset]. https://gimi9.com/dataset/data-gov_delta-produce-sources-study-51a7a
    Explore at:
    Dataset updated
    Feb 12, 2021
    License

    CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    Resource Description: The dataset contains variables corresponding to availability, source (country, state and town if country is the United States), quality, and price (by weight or volume) of 13 fresh fruits and 32 fresh vegetables sold in farmers markets and grocery stores located in 5 Lower Mississippi Delta towns.
    Resource Software Recommended: Microsoft Excel, url: https://www.microsoft.com/en-us/microsoft-365/excel
    Resource Title: Delta Produce Sources Study data dictionary. File Name: DPS Data Dictionary Public.csv
    Resource Description: This file is the data dictionary corresponding to the Delta Produce Sources Study dataset.
    Resource Software Recommended: Microsoft Excel, url: https://www.microsoft.com/en-us/microsoft-365/excel

  20. Cleaned NHANES 1988-2018

    • figshare.com
    txt
    Updated Feb 18, 2025
    Cite
    Vy Nguyen; Lauren Y. M. Middleton; Neil Zhao; Lei Huang; Eliseu Verly; Jacob Kvasnicka; Luke Sagers; Chirag Patel; Justin Colacino; Olivier Jolliet (2025). Cleaned NHANES 1988-2018 [Dataset]. http://doi.org/10.6084/m9.figshare.21743372.v9
    Explore at:
    txt
    Available download formats
    Dataset updated
    Feb 18, 2025
    Dataset provided by
    figshare
    Authors
    Vy Nguyen; Lauren Y. M. Middleton; Neil Zhao; Lei Huang; Eliseu Verly; Jacob Kvasnicka; Luke Sagers; Chirag Patel; Justin Colacino; Olivier Jolliet
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The National Health and Nutrition Examination Survey (NHANES) provides data with considerable potential for studying the health and environmental exposures of the non-institutionalized US population. However, because NHANES data are plagued with multiple inconsistencies, these data must be processed before new insights can be derived through large-scale analyses. We therefore developed a set of curated and unified datasets by merging 614 separate files and harmonizing unrestricted data across NHANES III (1988-1994) and Continuous NHANES (1999-2018), totaling 135,310 participants and 5,078 variables. The variables convey demographics (281 variables), dietary consumption (324 variables), physiological functions (1,040 variables), occupation (61 variables), questionnaires (1,444 variables, e.g., physical activity, medical conditions, diabetes, reproductive health, blood pressure and cholesterol, early childhood), medications (29 variables), mortality information linked from the National Death Index (15 variables), survey weights (857 variables), environmental exposure biomarker measurements (598 variables), and chemical comments indicating which measurements are below or above the lower limit of detection (505 variables).

    CSV Data Record: The curated NHANES datasets and the data dictionaries comprise 23 .csv files and 1 Excel file. The curated NHANES datasets involve 20 .csv formatted files, two for each module, with one as the uncleaned version and the other as the cleaned version. The modules are labeled as follows: 1) mortality, 2) dietary, 3) demographics, 4) response, 5) medications, 6) questionnaire, 7) chemicals, 8) occupation, 9) weights, and 10) comments. "dictionary_nhanes.csv" is a dictionary that lists the variable name, description, module, category, units, CAS number, comment use, chemical family, chemical family shortened, number of measurements, and cycles available for all 5,078 variables in NHANES. "dictionary_harmonized_categories.csv" contains the harmonized categories for the categorical variables. "dictionary_drug_codes.csv" contains the dictionary of descriptors for the drug codes. "nhanes_inconsistencies_documentation.xlsx" is an Excel file that contains the cleaning documentation, which records all the inconsistencies for all affected variables to help curate each of the NHANES modules.

    R Data Record: For researchers who want to conduct their analysis in the R programming language, only the cleaned NHANES modules and the data dictionaries can be downloaded, as a .zip file that includes an .RData file and an .R file. "w - nhanes_1988_2018.RData" contains all the aforementioned datasets as R data objects. We make available all R scripts on customized functions that were written to curate the data. "m - nhanes_1988_2018.R" shows how we used the customized functions (i.e., our pipeline) to curate the original NHANES data.

    Example starter code: The set of starter code to help users conduct exposome analyses consists of four R Markdown files (.Rmd). We recommend going through the tutorials in order. "example_0 - merge_datasets_together.Rmd" demonstrates how to merge the curated NHANES datasets together. "example_1 - account_for_nhanes_design.Rmd" demonstrates how to conduct a linear regression model, a survey-weighted regression model, a Cox proportional hazard model, and a survey-weighted Cox proportional hazard model. "example_2 - calculate_summary_statistics.Rmd" demonstrates how to calculate summary statistics for one variable and for multiple variables, with and without accounting for the NHANES sampling design. "example_3 - run_multiple_regressions.Rmd" demonstrates how to run multiple regression models with and without adjusting for the sampling design.
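
    As a minimal sketch only, and in Python rather than the R tutorials supplied with the dataset, the following shows the kind of merge that "example_0 - merge_datasets_together.Rmd" performs. The module file names and the participant identifier column "SEQN" are assumptions; check dictionary_nhanes.csv and the downloaded file names before use.

    ```python
    # Minimal sketch: join two cleaned NHANES modules on an assumed participant ID.
    import pandas as pd

    demographics = pd.read_csv("demographics_clean.csv")  # hypothetical file name
    mortality = pd.read_csv("mortality_clean.csv")        # hypothetical file name

    # Merge on the participant identifier so each row describes one participant.
    merged = demographics.merge(mortality, on="SEQN", how="left")
    print(merged.shape)

    # The data dictionary maps variable names to descriptions, modules, and units.
    dictionary = pd.read_csv("dictionary_nhanes.csv")
    print(dictionary.columns.tolist())
    ```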

