100+ datasets found

Data from: Inventory of online public databases and repositories holding...
s.cnmilf.com
agdatacommons.nal.usda.gov
+3more
Updated Apr 21, 2025
Cite
Agricultural Research Service (2025). Inventory of online public databases and repositories holding agricultural data in 2017 [Dataset]. https://s.cnmilf.com/user74170196/https/catalog.data.gov/dataset/inventory-of-online-public-databases-and-repositories-holding-agricultural-data-in-2017-d4c81
Dataset updated
Apr 21, 2025
Dataset provided by
Agricultural Research Service (https://www.ars.usda.gov/)
Description
United States agricultural researchers have many options for making their data available online. This dataset aggregates the primary sources of ag-related data and determines where researchers are likely to deposit their agricultural data. These data serve as both a current landscape analysis and also as a baseline for future studies of ag research data.

Purpose

As sources of agricultural data become more numerous and disparate, and collaboration and open data become more expected if not required, this research provides a landscape inventory of online sources of open agricultural data. An inventory of current agricultural data sharing options will help assess how the Ag Data Commons, a platform for USDA-funded data cataloging and publication, can best support data-intensive and multi-disciplinary research. It will also help agricultural librarians assist their researchers in data management and publication.

The goals of this study were to:

establish where agricultural researchers in the United States (land grant and USDA researchers, primarily ARS, NRCS, USFS and other agencies) currently publish their data, including general research data repositories, domain-specific databases, and the top journals

compare how much data is in institutional vs. domain-specific vs. federal platforms

determine which repositories are recommended by top journals that require or recommend the publication of supporting data

ascertain where researchers not affiliated with funding or initiatives possessing a designated open data repository can publish data

Approach

The National Agricultural Library team focused on Agricultural Research Service (ARS), Natural Resources Conservation Service (NRCS), and United States Forest Service (USFS) style research data, rather than ag economics, statistics, and social sciences data.
To find domain-specific, general, institutional, and federal agency repositories and databases that are open to US research submissions and have some amount of ag data, resources including re3data, libguides, and ARS lists were analysed. Primarily environmental or public health databases were not included, but places where ag grantees would publish data were considered.

Search methods

We first compiled a list of known domain-specific USDA / ARS datasets / databases that are represented in the Ag Data Commons, including ARS Image Gallery, ARS Nutrition Databases (sub-components), SoyBase, PeanutBase, National Fungus Collection, i5K Workspace @ NAL, and GRIN. We then searched using search engines such as Bing and Google for non-USDA / federal ag databases, using Boolean variations of “agricultural data” / “ag data” / “scientific data” + NOT + USDA (to filter out the federal / USDA results). Most of these results were domain-specific, though some contained a mix of data subjects.

We then used search engines such as Bing and Google to find top agricultural university repositories using variations of “agriculture”, “ag data” and “university” to find schools with agriculture programs. Using that list of universities, we searched each university web site to see if their institution had a repository for their unique, independent research data if not apparent in the initial web browser search. We found both ag-specific university repositories and general university repositories that housed a portion of agricultural data. Ag-specific university repositories are included in the list of domain-specific repositories. Results included Columbia University – International Research Institute for Climate and Society, UC Davis – Cover Crops Database, etc. If a general university repository existed, we determined whether that repository could filter to include only data results after our chosen ag search terms were applied.
General university databases that contain ag data included Colorado State University Digital Collections, University of Michigan ICPSR (Inter-university Consortium for Political and Social Research), and University of Minnesota DRUM (Digital Repository of the University of Minnesota). We then split out NCBI (National Center for Biotechnology Information) repositories.

Next we searched the internet for open general data repositories using a variety of search engines, and repositories containing a mix of data, journals, books, and other types of records were tested to determine whether that repository could filter for data results after search terms were applied. General subject data repositories include Figshare, Open Science Framework, PANGAEA, Protein Data Bank, and Zenodo.

Finally, we compared scholarly journal suggestions for data repositories against our list to fill in any missing repositories that might contain agricultural data. Extensive lists of journals in which USDA published in 2012 and 2016 were compiled, combining search results in ARIS, Scopus, and the Forest Service's TreeSearch, plus the USDA web sites Economic Research Service (ERS), National Agricultural Statistics Service (NASS), Natural Resources Conservation Service (NRCS), Food and Nutrition Service (FNS), Rural Development (RD), and Agricultural Marketing Service (AMS). The top 50 journals' author instructions were consulted to see if they (a) ask or require submitters to provide supplemental data, or (b) require submitters to submit data to open repositories.

Data are provided for Journals based on a 2012 and 2016 study of where USDA employees publish their research studies, ranked by number of articles, including 2015/2016 Impact Factor, Author guidelines, Supplemental Data?, Supplemental Data reviewed?, Open Data (Supplemental or in Repository) Required? and Recommended data repositories, as provided in the online author guidelines for each of the top 50 journals.
Evaluation

We ran a series of searches on all resulting general subject databases with the designated search terms. From the results, we noted the total number of datasets in the repository, type of resource searched (datasets, data, images, components, etc.), percentage of the total database that each term comprised, any dataset with a search term that comprised at least 1% and 5% of the total collection, and any search term that returned greater than 100 and greater than 500 results. We compared domain-specific databases and repositories based on parent organization, type of institution, and whether data submissions were dependent on conditions such as funding or affiliation of some kind.

Results

A summary of the major findings from our data review:

Over half of the top 50 ag-related journals from our profile require or encourage open data for their published authors.

There are few general repositories that are both large AND contain a significant portion of ag data in their collection. GBIF (Global Biodiversity Information Facility), ICPSR, and ORNL DAAC were among those that had over 500 datasets returned with at least one ag search term and had that result comprise at least 5% of the total collection.

Not even one quarter of the domain-specific repositories and datasets reviewed allow open submission by any researcher regardless of funding or affiliation.

See included README file for descriptions of each individual data file in this dataset.

Resources in this dataset:

Resource Title: Journals. File Name: Journals.csv

Resource Title: Journals - Recommended repositories. File Name: Repos_from_journals.csv

Resource Title: TDWG presentation. File Name: TDWG_Presentation.pptx

Resource Title: Domain Specific ag data sources. File Name: domain_specific_ag_databases.csv

Resource Title: Data Dictionary for Ag Data Repository Inventory. File Name: Ag_Data_Repo_DD.csv

Resource Title: General repositories containing ag data. File Name: general_repos_1.csv

Resource Title: README and file inventory. File Name: README_InventoryPublicDBandREepAgData.txt
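The evaluation criteria in this entry (each term's share of the collection, the 1% and 5% share thresholds, and the 100 and 500 result-count thresholds) can be illustrated with a small sketch. This is not the authors' code: the function and field names are hypothetical, and the counts in the example are made up for illustration rather than taken from the dataset.

```python
from dataclasses import dataclass


@dataclass
class TermResult:
    term: str   # search term that was run against the repository
    hits: int   # number of records the repository returned for that term


def evaluate_repository(total_records, results):
    """Flag each search term against the share and count thresholds."""
    rows = []
    for r in results:
        # Percentage of the whole collection that this term's results make up.
        share = 100.0 * r.hits / total_records if total_records else 0.0
        rows.append({
            "term": r.term,
            "hits": r.hits,
            "share_pct": round(share, 2),
            "share_ge_1pct": share >= 1.0,
            "share_ge_5pct": share >= 5.0,
            "hits_gt_100": r.hits > 100,
            "hits_gt_500": r.hits > 500,
        })
    return rows


# Hypothetical counts, for illustration only.
example = evaluate_repository(
    total_records=10_000,
    results=[TermResult("agriculture", 620), TermResult("soil", 90)],
)
for row in example:
    print(row)
```

A repository would meet the "large and significant" description in the Results only if at least one term sets both `hits_gt_500` and `share_ge_5pct`.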
Dataset: A Systematic Literature Review on the topic of High-value datasets
zenodo.org
data.niaid.nih.gov
bin, png, txt
Updated Jul 11, 2024
Cite
Anastasija Nikiforova; Nina Rizun; Magdalena Ciesielska; Charalampos Alexopoulos; Andrea Miletič (2024). Dataset: A Systematic Literature Review on the topic of High-value datasets [Dataset]. http://doi.org/10.5281/zenodo.8075918
Available download formats: png, bin, txt
Unique identifier
https://doi.org/10.5281/zenodo.8075918
Dataset updated
Jul 11, 2024
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Anastasija Nikiforova; Nina Rizun; Magdalena Ciesielska; Charalampos Alexopoulos; Andrea Miletič
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
This dataset contains data collected during a study ("Towards High-Value Datasets determination for data-driven development: a systematic literature review") conducted by Anastasija Nikiforova (University of Tartu), Nina Rizun, Magdalena Ciesielska (Gdańsk University of Technology), Charalampos Alexopoulos (University of the Aegean) and Andrea Miletič (University of Zagreb).
It is being made public both to act as supplementary data for the "Towards High-Value Datasets determination for data-driven development: a systematic literature review" paper (a pre-print is available in Open Access at https://arxiv.org/abs/2305.10234) and so that other researchers can use these data in their own work.

The protocol is intended for the Systematic Literature Review (SLR) on the topic of High-value Datasets, with the aim of gathering information on how the topic of High-value datasets (HVD) and their determination has been reflected in the literature over the years and what has been found by these studies to date, incl. the indicators used in them, involved stakeholders, data-related aspects, and frameworks. The data in this dataset were collected as a result of the SLR over Scopus, Web of Science, and Digital Government Research library (DGRL) in 2023.

***Methodology***

To understand how HVD determination has been reflected in the literature over the years and what has been found by these studies to date, all relevant literature covering this topic has been studied. To this end, the SLR was carried out by searching digital libraries covered by Scopus, Web of Science (WoS), and Digital Government Research library (DGRL).

These databases were queried for keywords ("open data" OR "open government data") AND ("high-value data*" OR "high value data*"), which were applied to the article title, keywords, and abstract to limit the number of papers to those where these objects were primary research objects rather than mentioned in the body, e.g., as future work. After deduplication, 11 unique articles remained and were checked for relevance. As a result, a total of 9 articles were examined further. Each study was independently examined by at least two authors.
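The keyword query and the deduplication step described above can be approximated locally once search results have been exported. The sketch below is an illustration only, not the authors' procedure: the record field names (`title`, `keywords`, `abstract`) and the title-based deduplication key are assumptions, and the sample records are invented.

```python
import re

# ("open data" OR "open government data")
GROUP_A = [r"open data", r"open government data"]
# ("high-value data*" OR "high value data*"); the trailing * is a wildcard
GROUP_B = [r"high-value data\w*", r"high value data\w*"]


def _any(patterns, text):
    return any(re.search(p, text, flags=re.IGNORECASE) for p in patterns)


def matches_query(record):
    """Apply the query to title, keywords, and abstract only (hypothetical field names)."""
    text = " ".join(record.get(f, "") for f in ("title", "keywords", "abstract"))
    return _any(GROUP_A, text) and _any(GROUP_B, text)


def deduplicate(records):
    """Keep the first record for each normalised title (an assumed dedup key)."""
    seen = set()
    unique = []
    for rec in records:
        key = re.sub(r"\W+", " ", rec.get("title", "")).strip().lower()
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique


# Invented sample records, for illustration only.
records = [
    {"title": "Identifying high-value datasets in open data portals"},
    {"title": "Identifying High-Value Datasets in Open Data Portals."},
    {"title": "Open data for smart cities", "abstract": "A survey of portals."},
]
hits = [r for r in deduplicate(records) if matches_query(r)]
print(len(hits))
```

Restricting the match to title, keywords, and abstract mirrors the stated intent of keeping only papers where HVD is a primary research object.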

To attain the objective of our study, we developed the protocol, where the information on each selected study was collected in four categories: (1) descriptive information, (2) approach- and research design- related information, (3) quality-related information, (4) HVD determination-related information.

***Test procedure***
Each study was independently examined by at least two authors, where after the in-depth examination of the full-text of the article, the structured protocol has been filled for each study.
The structure of the survey is available in the supplementary files (see Protocol_HVD_SLR.odt, Protocol_HVD_SLR.docx).
The data collected for each study by two researchers were then synthesized in one final version by the third researcher.

***Description of the data in this data set***

Protocol_HVD_SLR provides the structure of the protocol
Spreadsheet #1 provides the filled protocol for relevant studies.
Spreadsheet #2 provides the list of results after the search over three indexing databases, i.e. before filtering out irrelevant studies

The information on each selected study was collected in four categories:
(1) descriptive information,
(2) approach- and research design- related information,
(3) quality-related information,
(4) HVD determination-related information

Descriptive information
1) Article number - a study number, corresponding to the study number assigned in an Excel worksheet
2) Complete reference - the complete source information to refer to the study
3) Year of publication - the year in which the study was published
4) Journal article / conference paper / book chapter - the type of the paper {journal article, conference paper, book chapter}
5) DOI / Website - a link to the website where the study can be found
6) Number of citations - the number of citations of the article in Google Scholar, Scopus, Web of Science
7) Availability in OA - availability of an article in the Open Access
8) Keywords - keywords of the paper as indicated by the authors
9) Relevance for this study - what is the relevance level of the article for this study? {high / medium / low}

Approach- and research design-related information
10) Objective / RQ - the research objective / aim, established research questions
11) Research method (including unit of analysis) - the methods used to collect data, including the unit of analysis (country, organisation, specific unit that has been analysed, e.g., the number of use-cases, scope of the SLR etc.)
12) Contributions - the contributions of the study
13) Method - whether the study uses a qualitative, quantitative, or mixed methods approach?
14) Availability of the underlying research data - whether there is a reference to the publicly available underlying research data e.g., transcriptions of interviews, collected data, or explanation why these data are not shared?
15) Period under investigation - period (or moment) in which the study was conducted
16) Use of theory / theoretical concepts / approaches - does the study mention any theory / theoretical concepts / approaches? If any theory is mentioned, how is theory used in the study?

Quality- and relevance- related information
17) Quality concerns - whether there are any quality concerns (e.g., limited information about the research methods used)?
18) Primary research object - is the HVD a primary research object in the study? (primary - the paper is focused around the HVD determination, secondary - mentioned but not studied (e.g., as part of discussion, future work etc.))

HVD determination-related information
19) HVD definition and type of value - how is the HVD defined in the article and / or any other equivalent term?
20) HVD indicators - what are the indicators to identify HVD? How were they identified? (components & relationships, “input -> output")
21) A framework for HVD determination - is there a framework presented for HVD identification? What components does it consist of and what are the relationships between these components? (detailed description)
22) Stakeholders and their roles - what stakeholders or actors does HVD determination involve? What are their roles?
23) Data - what data do HVD cover?
24) Level (if relevant) - what is the level of the HVD determination covered in the article? (e.g., city, regional, national, international)

***Format of the file***
.xls, .csv (for the first spreadsheet only), .odt, .docx

***Licenses or restrictions***
CC-BY

For more info, see README.txt
‘Open Data 500 Companies’ analyzed by Analyst-2
analyst-2.ai
Updated Nov 21, 2021
Cite
Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2021). ‘Open Data 500 Companies’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/kaggle-open-data-500-companies-b2af/2ce9feba/?iid=009-471&v=presentation
Dataset updated
Nov 21, 2021
Dataset authored and provided by
Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
Analysis of ‘Open Data 500 Companies’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/govlab/open-data-500-companies on 12 November 2021.

--- Dataset description provided by original source is as follows ---

Context

The Open Data 500, funded by the John S. and James L. Knight Foundation (http://www.knightfoundation.org/) and conducted by the GovLab, is the first comprehensive study of U.S. companies that use open government data to generate new business and develop new products and services.

Study Goals

Provide a basis for assessing the economic value of government open data

Encourage the development of new open data companies

Foster a dialogue between government and business on how government data can be made more useful

The GovLab's Approach

The Open Data 500 study is conducted by the GovLab at New York University with funding from the John S. and James L. Knight Foundation. The GovLab works to improve people’s lives by changing how we govern, using technology-enabled solutions and a collaborative, networked approach. As part of its mission, the GovLab studies how institutions can publish the data they collect as open data so that businesses, organizations, and citizens can analyze and use this information.

Company Identification

The Open Data 500 team has compiled our list of companies through (1) outreach campaigns, (2) advice from experts and professional organizations, and (3) additional research.

Outreach Campaign

Mass email to over 3,000 contacts in the GovLab network

Mass email to over 2,000 contacts from OpenDataNow.com

Blog posts on TheGovLab.org and OpenDataNow.com

Social media recommendations

Media coverage of the Open Data 500

Attending presentations and conferences

Expert Advice

Recommendations from government and non-governmental organizations

Guidance and feedback from Open Data 500 advisors

Research

Companies identified for the book, Open Data Now

Companies using datasets from Data.gov

Directory of open data companies developed by Deloitte

Online Open Data Userbase created by Socrata

General research from publicly available sources

What The Study Is Not

The Open Data 500 is not a rating or ranking of companies. It covers companies of different sizes and categories, using various kinds of data.

The Open Data 500 is not a competition, but an attempt to give a broad, inclusive view of the field.

The Open Data 500 study also does not provide a random sample for definitive statistical analysis. Since this is the first thorough scan of companies in the field, it is not yet possible to determine the exact landscape of open data companies.

--- Original source retains full ownership of the source dataset ---
Fostering cultures of open qualitative research: Dataset 2 – Interview...
orda.shef.ac.uk
xlsx
Updated Jun 28, 2023
Cite
Matthew Hanchard; Itzel San Roman Pineda (2023). Fostering cultures of open qualitative research: Dataset 2 – Interview Transcripts [Dataset]. http://doi.org/10.15131/shef.data.23567223.v2
Available download formats: xlsx
Unique identifier
https://doi.org/10.15131/shef.data.23567223.v2
Dataset updated
Jun 28, 2023
Dataset provided by
The University of Sheffield
Authors
Matthew Hanchard; Itzel San Roman Pineda
License
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) (https://creativecommons.org/licenses/by-nc/4.0/)
License information was derived automatically
Description
This dataset was created and deposited onto the University of Sheffield Online Research Data repository (ORDA) on 23-Jun-2023 by Dr. Matthew S. Hanchard, Research Associate at the University of Sheffield iHuman Institute. The dataset forms part of three outputs from a project titled ‘Fostering cultures of open qualitative research’ which ran from January 2023 to June 2023:

· Fostering cultures of open qualitative research: Dataset 1 – Survey Responses
· Fostering cultures of open qualitative research: Dataset 2 – Interview Transcripts
· Fostering cultures of open qualitative research: Dataset 3 – Coding Book

The project was funded with £13,913.85 of Research England monies held internally by the University of Sheffield - as part of their ‘Enhancing Research Cultures’ scheme 2022-2023.

The dataset aligns with ethical approval granted by the University of Sheffield School of Sociological Studies Research Ethics Committee (ref: 051118) on 23-Jan-2021. This includes due concern for participant anonymity and data management.

ORDA has full permission to store this dataset and to make it open access for public re-use on the basis that no commercial gain will be made from reuse. It has been deposited under a CC-BY-NC license. Overall, this dataset comprises:

· 15 x Interview transcripts - in .docx file format which can be opened with Microsoft Word, Google Doc, or an open-source equivalent.

All participants have read and approved their transcripts and have had an opportunity to retract details should they wish to do so.

Participants chose whether to be pseudonymised or named directly. The pseudonym can be used to identify individual participant responses in the qualitative coding held within the ‘Fostering cultures of open qualitative research: Dataset 3 – Coding Book’ files.

For recruitment, 14 x participants were selected based on their responses to the project survey, whilst one participant was recruited based on specific expertise.

· 1 x Participant sheet – in .csv format which may be opened with Microsoft Excel, Google Sheet, or an open-source equivalent.

The sheet provides socio-demographic detail on each participant alongside their main field of research and career stage. It includes a RespondentID field/column which can be used to connect interview participants with their responses to the survey questions in the accompanying ‘Fostering cultures of open qualitative research: Dataset 1 – Survey Responses’ files.
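Linking the participant sheet to the survey responses through RespondentID amounts to a keyed lookup. The following is a minimal sketch under stated assumptions: only the `RespondentID` column name comes from the description, while the other column names and all values are hypothetical, and inline CSV text stands in for the real files.

```python
import csv
import io


def index_by_respondent(rows):
    """Build a lookup from RespondentID to the full row."""
    return {row["RespondentID"]: row for row in rows}


def link_participants_to_survey(participants, survey_rows):
    """Attach each participant's survey row, or None when there is no match."""
    survey = index_by_respondent(survey_rows)
    return [{**p, "survey": survey.get(p["RespondentID"])} for p in participants]


# Hypothetical stand-ins for the participant sheet and the survey export.
participant_csv = "RespondentID,CareerStage\nR01,Early career\nR02,Senior\n"
survey_csv = "RespondentID,Q1\nR01,Yes\n"

participants = list(csv.DictReader(io.StringIO(participant_csv)))
survey_rows = list(csv.DictReader(io.StringIO(survey_csv)))
linked = link_participants_to_survey(participants, survey_rows)

for row in linked:
    print(row["RespondentID"], row["survey"])
```

Participants without a matching survey row are kept with `None` rather than dropped, so that no interviewee silently disappears from the linked view.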

The project was undertaken by two staff:

Co-investigator: Dr. Itzel San Roman Pineda
ORCiD ID: 0000-0002-3785-8057
i.sanromanpineda@sheffield.ac.uk
Postdoctoral Research Assistant
Labelled as ‘Researcher 1’ throughout the dataset

Principal Investigator (corresponding dataset author): Dr. Matthew Hanchard
ORCiD ID: 0000-0003-2460-8638
m.s.hanchard@sheffield.ac.uk
Research Associate
iHuman Institute, Social Research Institutes, Faculty of Social Science
Labelled as ‘Researcher 2’ throughout the dataset
Dataset 1: Studies included in literature review
catalog.data.gov
data.amerigeoss.org
Updated Nov 12, 2020
Cite
U.S. EPA Office of Research and Development (ORD) (2020). Dataset 1: Studies included in literature review [Dataset]. https://catalog.data.gov/dataset/dataset-1-studies-included-in-literature-review
Dataset updated
Nov 12, 2020
Dataset provided by
United States Environmental Protection Agency (http://www.epa.gov/)
Description
This dataset contains the results of a literature review of experimental nutrient addition studies to determine which nutrient forms were most often measured in the scientific literature. To obtain a representative selection of relevant studies, we searched Web of Science™ using a search string to target experimental studies in artificial and natural lotic systems while limiting irrelevant papers. We screened the titles and abstracts of returned papers for relevance (experimental studies in streams/stream mesocosms that manipulated nutrients). To supplement this search, we sorted the relevant articles from the Web of Science™ search alphabetically by author and sequentially examined the bibliographies for additional relevant articles (screening titles for relevance, and then screening abstracts of potentially relevant articles) until we had obtained a total of 100 articles. If we could not find a relevant article electronically, we moved to the next article in the bibliography. Our goal was not to be completely comprehensive, but to obtain a fairly large sample of published, peer-reviewed studies from which to assess patterns. We excluded any lentic or estuarine studies from consideration and included only studies that used mesocosms mimicking stream systems (flowing water or stream water source) or that manipulated nutrient concentrations in natural streams or rivers. We excluded studies that used nutrient diffusing substrate (NDS) because these manipulate nutrients on substrates and not in the water column. We also excluded studies examining only nutrient uptake, which rely on measuring dissolved nutrient concentrations with the goal of characterizing in-stream processing (e.g., Newbold et al., 1983). 
From the included studies, we extracted or summarized the following information: study type, study duration, nutrient treatments, nutrients measured, inclusion of TN and/or TP response to nutrient additions, and a description of how results were reported in relation to the research-management mismatch, if it existed.

Below is information on how the search was conducted:

Search string used for Web of Science advanced search (search conducted on 27 September 2016):

TS= (stream OR creek OR river* OR lotic OR brook OR headwater OR tributary) AND TS = (mesocosm OR flume OR "artificial stream" OR "experimental stream" OR "nutrient addition") AND TI= (nitrogen OR phosphorus OR nutrient OR enrichment OR fertilization OR eutrophication)
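A search string of this shape (OR-groups per field tag, joined with AND) is easier to audit when it is assembled from term lists. The sketch below rebuilds a string with the same terms; the helper name is hypothetical and the spacing around `=` is normalised, so the output is not byte-identical to the string quoted above.

```python
def or_group(tag, terms):
    """Join terms with OR under one field tag, quoting multi-word phrases."""
    quoted = [f'"{t}"' if " " in t else t for t in terms]
    return f"{tag}=({' OR '.join(quoted)})"


# Term lists copied from the search string in this entry.
system_terms = ["stream", "creek", "river*", "lotic", "brook", "headwater", "tributary"]
design_terms = ["mesocosm", "flume", "artificial stream", "experimental stream",
                "nutrient addition"]
title_terms = ["nitrogen", "phosphorus", "nutrient", "enrichment", "fertilization",
               "eutrophication"]

query = " AND ".join([
    or_group("TS", system_terms),   # TS: topic field
    or_group("TS", design_terms),
    or_group("TI", title_terms),    # TI: title field
])
print(query)
```

Keeping the three term lists separate makes it straightforward to add or drop a synonym and regenerate the query consistently.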
Student Performance & Learning Style
kaggle.com
Updated Feb 12, 2025
Cite
Adil Shamim (2025). Student Performance & Learning Style [Dataset]. https://www.kaggle.com/datasets/adilshamim8/student-performance-and-learning-style/data
Available format: Croissant (a format for machine-learning datasets; see mlcommons.org/croissant)
Dataset updated
Feb 12, 2025
Dataset provided by
Kaggle
Authors
Adil Shamim
Description
You should not take this dataset seriously, as it is a synthetic representation based on true trends in education and career outcomes.

About the Dataset

This dataset provides insights into how different study habits, learning styles, and external factors influence student performance. It includes 10,000 records, covering details about students' study hours, online learning participation, exam scores, and other factors impacting academic success.

Dataset Features

Student_ID – Unique identifier for each student

Age – Student's age (18-30 years)

Gender – Male, Female, or Other

Study_Hours_per_Week – Hours spent studying per week (5-50 hours)

Preferred_Learning_Style – Visual, Auditory, Reading/Writing, Kinesthetic

Online_Courses_Completed – Number of online courses completed (0-20)

Participation_in_Discussions – Whether the student actively participates in discussions (Yes/No)

Assignment_Completion_Rate (%) – Percentage of assignments completed (50%-100%)

Exam_Score (%) – Student’s final exam score (40%-100%)

Attendance_Rate (%) – Percentage of classes attended (50%-100%)

Use_of_Educational_Tech – Whether the student uses educational technology (Yes/No)

Self_Reported_Stress_Level – Student’s stress level (Low, Medium, High)

Time_Spent_on_Social_Media (hours/week) – Weekly hours spent on social media (0-30 hours)

Sleep_Hours_per_Night – Average sleep duration (4-10 hours)

Final_Grade – Assigned grade based on exam score (A, B, C, D, F)
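The documented ranges and categories above lend themselves to a simple sanity check before any modelling. This is a sketch only: the column names are copied from the feature list and may differ from the actual file headers, and the rule for deriving Final_Grade from Exam_Score is not given in the description, so it is not checked here.

```python
# Numeric ranges as stated in the feature list (inclusive bounds assumed).
NUMERIC_RANGES = {
    "Age": (18, 30),
    "Study_Hours_per_Week": (5, 50),
    "Online_Courses_Completed": (0, 20),
    "Assignment_Completion_Rate (%)": (50, 100),
    "Exam_Score (%)": (40, 100),
    "Attendance_Rate (%)": (50, 100),
    "Time_Spent_on_Social_Media (hours/week)": (0, 30),
    "Sleep_Hours_per_Night": (4, 10),
}

# Allowed values as stated in the feature list.
CATEGORIES = {
    "Gender": {"Male", "Female", "Other"},
    "Preferred_Learning_Style": {"Visual", "Auditory", "Reading/Writing", "Kinesthetic"},
    "Participation_in_Discussions": {"Yes", "No"},
    "Use_of_Educational_Tech": {"Yes", "No"},
    "Self_Reported_Stress_Level": {"Low", "Medium", "High"},
    "Final_Grade": {"A", "B", "C", "D", "F"},
}


def validate(record):
    """Return a list of problems found in one record; empty means it passed."""
    problems = []
    for name, (low, high) in NUMERIC_RANGES.items():
        if name not in record:
            continue  # only check the fields that are present
        try:
            value = float(record[name])
        except (TypeError, ValueError):
            problems.append(f"{name}: not numeric ({record[name]!r})")
            continue
        if not low <= value <= high:
            problems.append(f"{name}: {value} outside {low}-{high}")
    for name, allowed in CATEGORIES.items():
        if name in record and record[name] not in allowed:
            problems.append(f"{name}: unexpected value {record[name]!r}")
    return problems


# Invented record, for illustration only.
print(validate({"Age": "17", "Gender": "Female", "Final_Grade": "E"}))
```

Values are passed through `float` so that the check works on strings read from a CSV file as well as on numbers.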

Use Cases

Predicting Student Performance – Analyze how different factors influence exam scores.

Educational Insights – Understand the impact of study habits, learning styles, and external activities.

Machine Learning Applications – Train predictive models for student success.
PLOS Open Science Indicators
plos.figshare.com
zip
Updated Jul 10, 2025
Cite
Public Library of Science (2025). PLOS Open Science Indicators [Dataset]. http://doi.org/10.6084/m9.figshare.21687686.v10
Available download formats: zip
Unique identifier
https://doi.org/10.6084/m9.figshare.21687686.v10
Dataset updated
Jul 10, 2025
Dataset provided by
PLOS (http://plos.org/)
Authors
Public Library of Science
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
This dataset contains article metadata and information about Open Science Indicators for approximately 139,000 research articles published in PLOS journals from 1 January 2018 to 30 March 2025 and a set of approximately 28,000 comparator articles published in non-PLOS journals. This is the tenth release of this dataset, which will be updated with new versions on an annual basis.

This version of the Open Science Indicators dataset shares the indicators seen in the previous versions as well as fully operationalised protocols and study registration indicators, which were previously only shared in preliminary forms. The v10 dataset focuses on detection of five Open Science practices by analysing the XML of published research articles:

Sharing of research data, in particular data shared in data repositories
Sharing of code
Posting of preprints
Sharing of protocols
Sharing of study registrations

The dataset provides data and code generation and sharing rates, and the location of shared data and code (whether in Supporting Information or in an online repository). It also provides preprint, protocol and study registration sharing rates as well as details of the shared output, such as publication date, URL/DOI/Registration Identifier and platform used. Additional data fields are also provided for each article analysed. This release has been run using an updated preprint detection method (see OSI-Methods-Statement_v10_Jul25.pdf for details). Further information on the methods used to collect and analyse the data can be found in Documentation.

Further information on the principles and requirements for developing Open Science Indicators is available in https://doi.org/10.6084/m9.figshare.21640889.

Data folders/files

Data Files folder

This folder contains the main OSI dataset files PLOS-Dataset_v10_Jul25.csv and Comparator-Dataset_v10_Jul25.csv, which contain:

descriptive metadata, e.g. article title, publication data, author countries, taken from the article .xml files
additional information around the Open Science Indicators derived algorithmically

The OSI-Summary-statistics_v10_Jul25.xlsx file contains the summary data for both PLOS-Dataset_v10_Jul25.csv and Comparator-Dataset_v10_Jul25.csv.

Documentation folder

This folder contains documentation related to the main data files. The file OSI-Methods-Statement_v10_Jul25.pdf describes the methods underlying the data collection and analysis. OSI-Column-Descriptions_v10_Jul25.pdf describes the fields used in PLOS-Dataset_v10_Jul25.csv and Comparator-Dataset_v10_Jul25.csv. OSI-Repository-List_v1_Dec22.xlsx lists the repositories and their characteristics used to identify specific repositories in the PLOS-Dataset_v10_Jul25.csv and Comparator-Dataset_v10_Jul25.csv repository fields.

The folder also contains documentation originally shared alongside the preliminary versions of the protocols and study registration indicators in order to give fuller details of their detection methods.

Contact details for further information:

Iain Hrynaszkiewicz, Director, Open Research Solutions, PLOS, ihrynaszkiewicz@plos.org / plos@plos.org
Lauren Cadwallader, Open Research Manager, PLOS, lcadwallader@plos.org / plos@plos.org

Acknowledgements:

Thanks to Allegra Pearce, Tim Vines, Asura Enkhbayar, Scott Kerr and parth sarin of DataSeer for contributing to data acquisition and supporting information.
LearnPlatform Educational Technology Engagement Dataset: Impact of COVID-19...
openicpsr.org
Updated Sep 16, 2021
Cite
Mary Styers (2021). LearnPlatform Educational Technology Engagement Dataset: Impact of COVID-19 on Digital Learning [Dataset]. http://doi.org/10.3886/E150042V1
Explore at:
Unique identifier
https://doi.org/10.3886/E150042V1
Dataset updated
Sep 16, 2021
Dataset provided by
LearnPlatform, Inc.
Authors
Mary Styers
License
Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Time period covered
Jan 2020 - Dec 2020
Area covered
United States
Description
LearnPlatform is a unique technology platform in the K-12 market providing the only broadly interoperable platform to the breadth of edtech solutions in the US K-12 field. A key component of edtech effectiveness is integrated reporting on tool usage and, where applicable, evidence of efficacy. With COVID closures, LearnPlatform has emerged as an important and singular resource to measure whether students are accessing digital resources within distance learning constraints. This platform provides a unique and needed source of data to understand if students are accessing digital resources, and where resources have disparate usage and impact.

In this dataset we are sharing educational technology usage across the 8,000+ tools used in the education field in 2020. We make this dataset available to the public so that educators, district leaders, researchers, institutions, policy-makers or anyone interested in learning about digital learning in 2020 can use it to understand student engagement with core learning activities during the COVID-19 pandemic. Some example research questions that this dataset can help stakeholders answer:

What is the picture of digital connectivity and engagement in 2020?

What is the effect of the COVID-19 pandemic on online and distance learning, and how might this evolve in the future?

How does student engagement with different types of education technology change over the course of the pandemic?

How does student engagement with online learning platforms relate to different geography? Demographic context (e.g., race/ethnicity, ESL, learning disability)? Learning context? Socioeconomic status?

Do certain state interventions, practices or policies (e.g., stimulus, reopening, eviction moratorium) correlate with increases or decreases in online engagement?
Survey data on research data management practices and perceptions of MRI...
kilthub.cmu.edu
txt
Updated May 30, 2023
Cite
John Borghi; Ana Van Gulick (2023). Survey data on research data management practices and perceptions of MRI researchers [Dataset]. http://doi.org/10.1184/R1/5845656.v1
Explore at:
Available download formats: txt
Unique identifier
https://doi.org/10.1184/R1/5845656.v1
Dataset updated
May 30, 2023
Dataset provided by
Carnegie Mellon University
Authors
John Borghi; Ana Van Gulick
License
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
This dataset contains the results of an online survey run in the summer of 2017 on the research data management (RDM) practices and perceptions of researchers using magnetic resonance imaging (MRI) to study human neuroscience (N=144). The dataset includes responses to multiple choice questions ordered roughly according to the phases of a typical research project, including data collection, analysis, and sharing. It focuses on a range of RDM topics, including the type of data collected, software and tools used to analyze and manage data, and the degree to which data management practices are standardized within a research group. It also includes participant ratings of the maturity of their data management practices and those of the field at large on a 1-5 scale from ad hoc to refined, and responses about perceptions of new scholarly communications practices including data sharing, data reuse, and Open Access publishing.

The survey instrument used can be found at the link in the reference below.
HJA Online Studies Map
data-osugisci.opendata.arcgis.com
Updated Apr 2, 2019
Cite
Oregon State University GISci (2019). HJA Online Studies Map [Dataset]. https://data-osugisci.opendata.arcgis.com/datasets/hja-online-studies-map/about
Explore at:
Dataset updated
Apr 2, 2019
Dataset authored and provided by
Oregon State University GISci
Description
H.J. Andrews Online Studies Map is a compilation of study site locations and pertinent GIS layers for the geographic area of H.J. Andrews Experimental Forest (LTER). For questions about content contact the HJ Andrews spatial data administrator via the contacts link at this page: https://andrewsforest.oregonstate.edu/data. Some individual data sets are available at the HJ Andrews Open Data Hub here: http://data-osugisci.opendata.arcgis.com/.
Global Government Open Data Management Platform Market Size By Product Type...
verifiedmarketresearch.com
Updated Nov 15, 2024
Cite
VERIFIED MARKET RESEARCH (2024). Global Government Open Data Management Platform Market Size By Product Type (On Premise, Cloud Based), By Application (Public, Private), By Organization Type (Large Enterprise, SMES), By Geographic Scope and Forecast, By Geographic Scope and Forecast [Dataset]. https://www.verifiedmarketresearch.com/product/government-open-data-management-platform-market/
Explore at:
Dataset updated
Nov 15, 2024
Dataset provided by
Verified Market Research https://www.verifiedmarketresearch.com/
Authors
VERIFIED MARKET RESEARCH
License
https://www.verifiedmarketresearch.com/privacy-policy/
Time period covered
2026 - 2032
Area covered
Global
Description
Global Government Open Data Management Platform Market size was valued at USD 1.75 Billion in 2024 and is projected to reach USD 3.38 Billion by 2032, growing at a CAGR of 8.54% from 2026 to 2032.
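As a rough plausibility check on the figures above, the growth rate implied by the two valuations can be recomputed. The sketch below assumes the standard compound annual growth rate formula and, as a simplification, treats 2024 and 2032 as the endpoints. The source quotes its 8.54% figure for 2026 to 2032, whose starting value is not given, so the two rates need not match exactly. The function name `cagr` is illustrative.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate between two values `years` apart."""
    return (end_value / start_value) ** (1 / years) - 1


# Valuations quoted in the description, in USD billions.
# Assumption: endpoints are taken as 2024 and 2032 (8 years).
implied_rate = cagr(1.75, 3.38, 2032 - 2024)
print(f"Implied CAGR 2024-2032: {implied_rate:.2%}")
```

The implied rate lands close to the quoted one, which suggests the published numbers are at least internally consistent.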

Global Government Open Data Management Platform Market Drivers

Increasing Demand for Transparency and Accountability: There is a growing public demand for transparency in government operations, which drives the adoption of open data initiatives. According to a survey by the World Bank, 85% of respondents in various countries indicated that transparency in government decisions is crucial for reducing corruption, prompting governments to implement open data platforms.

Technological Advancements: Rapid advancements in information and communication technology (ICT) facilitate the development and deployment of open data management platforms. The International Telecommunication Union (ITU) reported that global Internet penetration reached approximately 64% in 2023, enabling more citizens to access open data and engage with government services online.

Government Initiatives and Policies: Many governments are actively promoting open data through policies and initiatives. For instance, the U.S. government's Open Data Initiative, launched in 2013, has led to the publication of over 300,000 datasets on Data.gov. Additionally, the European Union's Open Data Directive, which aims to make public sector data available, is further encouraging governments to embrace open data practices.
Dataset: maturity of transparency of open data ecosystems in 22 smart cities...
data.niaid.nih.gov
zenodo.org
Updated Apr 27, 2022
Cite
Anastasija Nikiforova (2022). Dataset: maturity of transparency of open data ecosystems in 22 smart cities [Dataset]. https://data.niaid.nih.gov/resources?id=ZENODO_6497068
Explore at:
Dataset updated
Apr 27, 2022
Dataset provided by
Martin Lnenicka
Anastasija Nikiforova
Mariusz Luterek
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset contains data collected during a study "Transparency of open data ecosystems in smart cities: Definition and assessment of the maturity of transparency in 22 smart cities" (Sustainable Cities and Society (SCS), vol.82, 103906) conducted by Martin Lnenicka (University of Pardubice), Anastasija Nikiforova (University of Tartu), Mariusz Luterek (University of Warsaw), Otmane Azeroual (German Centre for Higher Education Research and Science Studies), Dandison Ukpabi (University of Jyväskylä), Visvaldis Valtenbergs (University of Latvia), Renata Machova (University of Pardubice).

This study inspects smart cities’ data portals and assesses their compliance with transparency requirements for open (government) data by means of the expert assessment of 34 portals representing 22 smart cities, with 36 features.

It is being made public both to act as supplementary data for the paper and so that other researchers can use these data in their own work, potentially contributing to the improvement of current data ecosystems and helping to build sustainable, transparent, citizen-centered, and socially resilient open data-driven smart cities.

Purpose of the expert assessment

The data in this dataset were collected by applying the benchmarking framework for assessing the compliance of open (government) data portals with the principles of transparency-by-design, proposed by Lněnička and Nikiforova (2021)*, to 34 portals that can be considered part of open data ecosystems in smart cities. Experts assessed each portal against 36 features, which makes it possible to rank the portals, discuss their maturity levels and, based on the results of the assessment, define the components and unique models that form the open data ecosystem in the smart city context.

Methodology

Sample selection: the capitals of the Member States of the European Union and countries of the European Economic Area were selected to ensure a more coherent political and legal framework. They were mapped/cross-referenced with their rank in 5 smart city rankings: IESE Cities in Motion Index, Top 50 smart city governments (SCG), IMD smart city index (SCI), global cities index (GCI), and sustainable cities index (SCI). A purposive sampling method and systematic search for portals was then carried out to identify relevant websites for each city using two complementary techniques: browsing and searching.

To evaluate the transparency maturity of data ecosystems in smart cities, we have used the transparency-by-design framework (Lněnička & Nikiforova, 2021)*. The benchmarking supposes the collection of quantitative data, which makes this task an acceptability task. A six-point Likert scale was applied for evaluating the portals. Each sub-dimension was supplied with its description to ensure a common understanding, a drop-down list to select the level at which the respondent (dis)agrees, and an optional comment field. This formed a protocol to be completed for every portal. Each sub-dimension/feature was assessed using a six-point Likert scale, where strong agreement is assessed with 6 points, while strong disagreement is represented by 1 point.

Each website (portal) was evaluated by experts, where a person is considered to be an expert if they work with open (government) data and data portals daily, i.e., it is the key part of their job; experts can be public officials, researchers, or members of independent organizations. In other words, compliance with the expert profile according to the International Certification of Digital Literacy (ICDL) and its derivation proposed in Lněnička et al. (2021)* is expected to be met. When all individual protocols were collected, mean values and standard deviations (SD) were calculated, and if statistical contradictions/inconsistencies were found, reassessment took place to ensure individual consistency and interrater reliability among experts’ answers.

*Lnenicka, M., & Nikiforova, A. (2021). Transparency-by-design: What is the role of open data portals?. Telematics and Informatics, 61, 101605

*Lněnička, M., Machova, R., Volejníková, J., Linhartová, V., Knezackova, R., & Hub, M. (2021). Enhancing transparency through open government data: the case of data portals and their features and capabilities. Online Information Review.

Test procedure

(1) Perform an assessment of each dimension using sub-dimensions, mapping out the achievement of each indicator.

(2) Aggregate all sub-dimensions in one dimension and calculate the average value based on the number of sub-dimensions; the resulting average stands for the dimension value, giving eight values per portal.

(3) Calculate the average value across all dimensions and map it to the maturity level; this value is also used to rank the portals.
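The aggregation in the test procedure can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the portal names, dimension names and scores are made up, only two dimensions are shown instead of the eight the study uses, and the maturity thresholds are hypothetical because the description does not give the actual cut-off values.

```python
from statistics import mean


def dimension_scores(ratings):
    """Step 2: average the sub-dimension scores (1-6 Likert) per dimension."""
    return {dim: mean(subs) for dim, subs in ratings.items()}


def portal_score(ratings):
    """Step 3: average the dimension values into one value per portal."""
    return mean(list(dimension_scores(ratings).values()))


def maturity_level(score, thresholds):
    """Map a portal score to a label.

    `thresholds` is a list of (lower_bound, label) pairs sorted ascending.
    The bounds used below are hypothetical, not taken from the study.
    """
    level = thresholds[0][1]
    for bound, label in thresholds:
        if score >= bound:
            level = label
    return level


def rank_portals(portals):
    """Order portal names from highest to lowest overall score."""
    return sorted(portals, key=lambda name: portal_score(portals[name]), reverse=True)


# Made-up example data: two portals, two dimensions each.
portals = {
    "portal_a": {"dim_1": [6, 5, 4], "dim_2": [3, 3]},
    "portal_b": {"dim_1": [2, 2, 2], "dim_2": [4, 2]},
}
hypothetical_thresholds = [(1, "low"), (3, "medium"), (5, "high")]

for name in rank_portals(portals):
    score = portal_score(portals[name])
    print(name, score, maturity_level(score, hypothetical_thresholds))
```

Averaging per dimension first, and only then across dimensions, gives each dimension equal weight regardless of how many sub-dimensions it contains.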

Description of the data in this data set

Sheet#1 "comparison_overall" provides results by portal

Sheet#2 "comparison_category" provides results by portal and category

Sheet#3 "category_subcategory" provides the list of categories and their elements

Format of the file .xls

Licenses or restrictions CC-BY

For more info, see README.txt
Data from: Bibliographic dataset characterizing studies that use online...
zenodo.org
portalcientifico.unav.edu
bin, csv
Updated Jan 24, 2020
Cite
Joan E. Ball-Damerow; Laura Brenskelle; Narayani Barve; Raphael LaFrance; Pamela S. Soltis; Petra Sierwald; Rüdiger Bieler; Arturo Ariño; Robert Guralnick (2020). Bibliographic dataset characterizing studies that use online biodiversity databases [Dataset]. http://doi.org/10.5281/zenodo.2589439
Explore at:
Available download formats: csv, bin
Unique identifier
https://doi.org/10.5281/zenodo.2589439
Dataset updated
Jan 24, 2020
Dataset provided by
Zenodo http://zenodo.org/
Authors
Joan E. Ball-Damerow; Laura Brenskelle; Narayani Barve; Raphael LaFrance; Pamela S. Soltis; Petra Sierwald; Rüdiger Bieler; Arturo Ariño; Robert Guralnick
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset includes bibliographic information for 501 papers that were published from 2010-April 2017 (time of search) and use online biodiversity databases for research purposes. Our overarching goal in this study is to determine how research uses of biodiversity data developed during a time of unprecedented growth of online data resources. We also determine uses with the highest number of citations, how online occurrence data are linked to other data types, and if/how data quality is addressed. Specifically, we address the following questions:

1.) What primary biodiversity databases have been cited in published research, and which databases have been cited most often?

2.) Is the biodiversity research community citing databases appropriately, and are the cited databases currently accessible online?

3.) What are the most common uses, general taxa addressed, and data linkages, and how have they changed over time?

4.) What uses have the highest impact, as measured through the mean number of citations per year?

5.) Are certain uses applied more often for plants/invertebrates/vertebrates?

6.) Are links to specific data types associated more often with particular uses?

7.) How often are major data quality issues addressed?

8.) What data quality issues tend to be addressed for the top uses?

Relevant papers for this analysis include those that use online and openly accessible primary occurrence records, or those that add data to an online database. Google Scholar (GS) provides full-text indexing, which was important to identify data sources that often appear buried in the methods section of a paper. Our search was therefore restricted to GS. All authors discussed and agreed upon representative search terms, which were relatively broad to capture a variety of databases hosting primary occurrence records. The terms included: “species occurrence” database (8,800 results), “natural history collection” database (634 results), herbarium database (16,500 results), “biodiversity database” (3,350 results), “primary biodiversity data” database (483 results), “museum collection” database (4,480 results), “digital accessible information” database (10 results), and “digital accessible knowledge” database (52 results)--note that quotations are used as part of the search terms where specific phrases are needed in whole. We downloaded all records returned by each search (or the first 500 if there were more) into a Zotero reference management database. About one third of the 2500 papers in the final dataset were relevant. Three of the authors with specialized knowledge of the field characterized relevant papers using a standardized tagging protocol based on a series of key topics of interest. We developed a list of potential tags and descriptions for each topic, including: database(s) used, database accessibility, scale of study, region of study, taxa addressed, research use of data, other data types linked to species occurrence data, data quality issues addressed, authors, institutions, and funding sources. Each tagged paper was thoroughly checked by a second tagger.
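The download rule described above (all records per search, or the first 500 if there were more) can be expressed as a small calculation. The sketch below uses the result counts quoted in the description; the variable names are illustrative, and the total it produces is only an upper bound before any de-duplication across searches, so it is not expected to equal the roughly 2500 papers the authors report.

```python
# Google Scholar result counts per search term, as quoted in the description.
RESULTS_PER_SEARCH = {
    '"species occurrence" database': 8800,
    '"natural history collection" database': 634,
    'herbarium database': 16500,
    '"biodiversity database"': 3350,
    '"primary biodiversity data" database': 483,
    '"museum collection" database': 4480,
    '"digital accessible information" database': 10,
    '"digital accessible knowledge" database': 52,
}

DOWNLOAD_CAP = 500  # "the first 500 if there were more"

# Records downloaded per search after applying the cap.
downloaded = {
    term: min(count, DOWNLOAD_CAP) for term, count in RESULTS_PER_SEARCH.items()
}
total_downloaded = sum(downloaded.values())

for term, count in downloaded.items():
    print(f"{count:>4}  {term}")
print("Upper bound before de-duplication:", total_downloaded)
```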

The final dataset of tagged papers allows us to quantify general areas of research made possible by the expansion of online species occurrence databases, and trends over time. Analyses of these data will be published in a separate quantitative review.
The Home of the U.S. Government's Open Data Here you will find data, tools,...
catalog.civicdataecosystem.org
Updated Aug 30, 2025
Cite
The citation is currently not available for this dataset.
Explore at:
Dataset updated
Aug 30, 2025
Description
The Home of the U.S. Government's Open Data Here you will find data, tools, and resources to conduct research, develop web and mobile applications, design data visualizations, and more.
Dataset with determinants or factors influencing graduate economics student...
search.dataone.org
data.niaid.nih.gov
+2more
Updated Nov 3, 2023
Cite
Zurika Robinson; Thea Uys (2023). Dataset with determinants or factors influencing graduate economics student preparation and success in an online environment [Dataset]. http://doi.org/10.5061/dryad.bvq83bkgd
Explore at:
Unique identifier
https://doi.org/10.5061/dryad.bvq83bkgd
Dataset updated
Nov 3, 2023
Dataset provided by
Dryad Digital Repository
Authors
Zurika Robinson; Thea Uys
Time period covered
Jan 1, 2023
Description
The data relate to the paper that analyses the determinants or factors that best explain student research skills and success in the honours research report module during the COVID-19 pandemic in 2021. The data were gathered through an online survey created with the Qualtrics software package. The research questions were developed from demographic factors and subject knowledge, including assignments, supervisor influence, and other factors in terms of experience or belonging that played a role (see anonymous link at https://unisa.qualtrics.com/jfe/form/SV_86OZZOdyA5sBurY). An SMS was sent to all students of the 2021 module group to make them aware of the survey. They were under no obligation to complete it and all information was regarded as anonymous. We received 39 responses. The raw data from the survey were processed with the SPSS statistical software package. The data file contains the demographics, frequencies, descriptives, and open questions processed. The study...
CDC WONDER API for Data Query Web Service
catalog.data.gov
odgavaprod.ogopendata.com
+3more
Updated Jul 26, 2023
Cite
Centers for Disease Control and Prevention, Department of Health & Human Services (2023). CDC WONDER API for Data Query Web Service [Dataset]. https://catalog.data.gov/dataset/wide-ranging-online-data-for-epidemiologic-research-wonder
Explore at:
Dataset updated
Jul 26, 2023
Dataset provided by
Centers for Disease Control and Prevention http://www.cdc.gov/
United States Department of Health and Human Services http://www.hhs.gov/
Description
WONDER online databases include county-level Compressed Mortality (death certificates) since 1979; county-level Multiple Cause of Death (death certificates) since 1999; county-level Natality (birth certificates) since 1995; county-level Linked Birth / Death records (linked birth-death certificates) since 1995; state & large metro-level United States Cancer Statistics mortality (death certificates) since 1999; state & large metro-level United States Cancer Statistics incidence (cancer registry cases) since 1999; state and metro-level Online Tuberculosis Information System (TB case reports) since 1993; state-level Sexually Transmitted Disease Morbidity (case reports) since 1984; state-level Vaccine Adverse Event Reporting system (adverse reaction case reports) since 1990; county-level population estimates since 1970. The WONDER web server also hosts the Data2010 system with state-level data for compliance with Healthy People 2010 goals since 1998; the National Notifiable Disease Surveillance System weekly provisional case reports since 1996; the 122 Cities Mortality Reporting System weekly death reports since 1996; the Prevention Guidelines database (book in electronic format) published 1998; the Scientific Data Archives (public use data sets and documentation); and links to other online data sources on the "Topics" page.
United States Federal Government Open Data Portal
data.pa.gov
csv, xlsx, xml
Updated Jul 6, 2018
Cite
United States Federal Government (2018). United States Federal Government Open Data Portal [Dataset]. https://data.pa.gov/Local-Government/United-States-Federal-Government-Open-Data-Portal/6pts-mmcx
Explore at:
Available download formats: xlsx, xml, csv
Dataset updated
Jul 6, 2018
Dataset provided by
Federal government of the United States http://www.usa.gov/
Authors
United States Federal Government
License
U.S. Government Works https://www.usa.gov/government-works
License information was derived automatically
Area covered
United States
Description
This is a link to the United States Federal Government's Open Data Portal. Here you will find data, tools, and resources to conduct research, develop web and mobile applications, design data visualizations.

Check out the attachment in the metadata detailing all the Opioid Related datasets contained in this portal.

Data.gov is the federal government’s open data site, and aims to make government more open and accountable. Opening government data increases citizen participation in government, creates opportunities for economic development, and informs decision making in both the private and public sectors.

Links are included for the Centers for Disease Control and Prevention: both the business website and the Data and Statistics website.
A Dataset on Online Learning-based Web Behavior from Different Countries...
ieee-dataport.org
Updated Jul 29, 2025
Cite
Saumick Pradhan (2025). A Dataset on Online Learning-based Web Behavior from Different Countries Before and After COVID-19 [Dataset]. https://ieee-dataport.org/open-access/dataset-online-learning-based-web-behavior-different-countries-and-after-covid-19
Explore at:
Dataset updated
Jul 29, 2025
Authors
Saumick Pradhan
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
2022
Map of articles about "Teaching Open Science"
zenodo.org
data.niaid.nih.gov
Updated Jan 24, 2020
Cite
Isabel Steinhardt (2020). Map of articles about "Teaching Open Science" [Dataset]. http://doi.org/10.5281/zenodo.3371415
Explore at:
Unique identifier
https://doi.org/10.5281/zenodo.3371415
Dataset updated
Jan 24, 2020
Dataset provided by
Zenodo http://zenodo.org/
Authors
Isabel Steinhardt
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This description is part of the blog post "Systematic Literature Review of teaching Open Science" https://sozmethode.hypotheses.org/839

In my opinion, we do not pay enough attention to teaching Open Science in higher education. Therefore, I designed a seminar to teach students the practices of Open Science by doing qualitative research. About this seminar, I wrote the article “Teaching Open Science and qualitative methods”. For that article, I started to review the literature on “Teaching Open Science”. The result of my literature review is that certain aspects of Open Science are used for teaching. However, Open Science with all its aspects (Open Access, Open Data, Open Methodology, Open Science Evaluation and Open Science Tools) is not an issue in publications about teaching.

Based on this insight, I have started a systematic literature review. I realized quickly that I need help to analyse and interpret the articles and to evaluate my preliminary findings. Especially different disciplinary cultures of teaching different aspects of Open Science are challenging, as I myself, as a social scientist, do not have enough insight to be able to interpret the results correctly. Therefore, I would like to invite you to participate in this research project!

I am now looking for people who would like to join a collaborative process to further explore and write the systematic literature review on “Teaching Open Science”, because I want to turn this project into a Massive Open Online Paper (MOOP). According to the 10 rules of Tennant et al. (2019) on MOOPs, it is crucial to find a core group that is enthusiastic about the topic. Therefore, I am looking for people who are interested in creating the structure of the paper and writing the paper together with me. I am also looking for people who want to search for and review literature or evaluate the literature I have already found. Together with the interested persons I would then define the rules for the project (cf. Tennant et al. 2019). So if you are interested in contributing to the further search for articles and/or in enhancing the interpretation and writing of results, please get in touch. For everyone interested in contributing, the list of articles collected so far is freely accessible at Zotero: https://www.zotero.org/groups/2359061/teaching_open_science. The figure shown below provides a first overview of my ongoing work. I created the figure with the free software yEd and uploaded the file to zenodo, so everyone can download and work with it:

To make transparent what I have done so far, I will first introduce what a systematic literature review is. Secondly, I describe the decisions I made to start with the systematic literature review. Third, I present the preliminary results.

Systematic literature review – an Introduction

Systematic literature reviews “are a method of mapping out areas of uncertainty, and identifying where little or no relevant research has been done.” (Petticrew/Roberts 2008: 2). Fink defines the systematic literature review as a “systemic, explicit, and reproducible method for identifying, evaluating, and synthesizing the existing body of completed and recorded work produced by researchers, scholars, and practitioners.” (Fink 2019: 6). The aim of a systematic literature review is to surpass the subjectivity of a researcher’s search for literature. However, there can never be an objective selection of articles, because the researcher has already made a preselection by deciding on search strings, for example “Teaching Open Science”. In this respect, transparency is the core criterion for a high-quality review.

In order to achieve high quality and transparency, Fink (2019: 6-7) proposes the following seven steps:

Selecting a research question.

Selecting the bibliographic database.

Choosing the search terms.

Applying practical screening criteria.

Applying methodological screening criteria.

Doing the review.

Synthesizing the results.

I have adapted these steps for the “Teaching Open Science” systematic literature review. In the following, I will present the decisions I have made.

Systematic literature review – decisions I made

Research question: I am interested in the following research questions: How is Open Science taught in higher education? Is Open Science taught in its full range with all aspects like Open Access, Open Data, Open Methodology, Open Science Evaluation and Open Science Tools? Which aspects are taught? Are there disciplinary differences as to which aspects are taught and, if so, why are there such differences?

Databases: I started my search at the Directory of Open Access Journals (DOAJ). “DOAJ is a community-curated online directory that indexes and provides access to high quality, open access, peer-reviewed journals.” (https://doaj.org/) Secondly, I used the Bielefeld Academic Search Engine (base). Base is operated by Bielefeld University Library and is “one of the world’s most voluminous search engines especially for academic web resources” (base-search.net). Both platforms are non-commercial and focus on Open Access publications and thus differ from the commercial publication databases, such as Web of Science and Scopus. For this project, I deliberately decided against commercial providers and against restricting the search to indexed journals, because my explicit aim was to find articles that are open in the context of Open Science.

Search terms: To identify articles about teaching Open Science I used the following search strings: “teaching open science” OR teaching “open science” OR teach “open science”. The topic search looked for the search strings in title, abstract and keywords of articles. Since these are very narrow search terms, I decided to broaden the method. I searched the reference lists of all articles returned by this search for further relevant literature. Using Google Scholar, I checked which other authors cited the articles in the sample. If the articles checked in this way met my methodological criteria, I included them in the sample and looked through their reference lists and citations at Google Scholar. This process has not yet been completed.
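The broadened search described above is a form of citation snowballing, which can be sketched as a breadth-first expansion. This is only an illustration: in the project the reference lists and Google Scholar citations were checked by hand, while the sketch assumes they are already available as two hypothetical dictionaries (`references` and `cited_by`) and that the methodological criteria are wrapped in a hypothetical `is_relevant` function.

```python
from collections import deque


def snowball(seeds, references, cited_by, is_relevant):
    """Expand a seed sample by following reference lists and citing articles.

    seeds       -- article IDs found by the initial database search
    references  -- hypothetical dict: article ID -> IDs it cites
    cited_by    -- hypothetical dict: article ID -> IDs that cite it
    is_relevant -- callable standing in for the methodological criteria
    """
    sample = set(seeds)
    queue = deque(seeds)
    while queue:
        article = queue.popleft()
        candidates = list(references.get(article, [])) + list(cited_by.get(article, []))
        for candidate in candidates:
            # Newly included articles are queued so that their own
            # references and citations are checked as well.
            if candidate not in sample and is_relevant(candidate):
                sample.add(candidate)
                queue.append(candidate)
    return sample


# Toy example with made-up article IDs.
toy_references = {"A": ["B", "C"], "B": ["D"]}
toy_cited_by = {"A": ["E"]}
toy_sample = snowball(["A"], toy_references, toy_cited_by, lambda art: art != "C")
print(sorted(toy_sample))
```

Because every newly included article is itself expanded, the process ends only when no further relevant articles turn up, which matches the note that the search has not yet been completed.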

Practical screening criteria: I have included English and German articles in the sample, as I speak these languages (articles in other languages are very welcome, if there are people who can interpret them!). Only journal articles, articles in edited volumes, working papers and conference papers from proceedings were included in the sample. I checked whether the journals were predatory journals; such articles were not included. I did not include blog posts, books or articles from newspapers. I only included articles whose full texts are accessible via my institution (University of Kassel). As a result, recently published articles at Elsevier could not be included because of the special situation in Germany regarding the Project DEAL (https://www.projekt-deal.de/about-deal/). For articles that are not freely accessible, I have checked whether there is an accessible version in a repository or whether a preprint is available. If this was not the case, the article was not included. I started the analysis in May 2019.

Methodological criteria: The method of checking reference lists described above has the problem of subjectivity. Therefore, I hope that other people will be interested in this project and evaluate my decisions. I have used the following criteria as the basis for my decisions: First, the articles must focus on teaching. For example, this means that articles must describe how a course was designed and carried out. Second, at least one aspect of Open Science has to be addressed. The aspects can be very diverse (FOSS, repositories, wikis, data management, etc.) but have to comply with the principles of openness. This means, for example, that I included an article if it deals with the use of FOSS in class and addresses the openness of FOSS. I did not include articles in which the authors describe the use of a particular free and open source software for teaching but do not address the principles of openness or re-use.

Doing the review: Because the method involves going through the reference lists, it is possible to create a map of how the articles relate to each other. This results in thematic clusters and connections between clusters. The starting point for the map was four articles (Cook et al. 2018; Marsden, Thompson, and Plonsky 2017; Petras et al. 2015; Toelch and Ostwald 2018) that I found using the databases and criteria described above. I used yEd to generate the network. “yEd is a powerful desktop application that can be used to quickly and effectively generate high-quality diagrams.” (https://www.yworks.com/products/yed) In the network, arrows show which articles are cited in an article and which articles are cited by others as well. In addition, I made an initial rough classification of the content using colours. This classification is based on the contents mentioned in the articles’ titles and abstracts. This rough content classification requires a more exact, i.e., content-based subdivision and
Dataset for: Same Question, Different Answers? An Empirical Comparison of...
b2find.eudat.eu
Updated Aug 7, 2025
Cite
(2025). Dataset for: Same Question, Different Answers? An Empirical Comparison of Web Data and Traditional Data - Dataset - B2FIND [Dataset]. https://b2find.eudat.eu/dataset/aa5eb0c9-80f7-57ff-9c4e-3a3fcc32b966
Description
Psychological scientists increasingly study web data, such as user ratings or social media postings. However, whether research relying on such web data leads to the same conclusions as research based on traditional data is largely unknown. To test this, we (re)analyzed three datasets, thereby comparing web data with lab and online survey data. We calculated correlations across these different datasets (Study 1) and investigated identical, illustrative research questions in each dataset (Studies 2 to 4). Our results suggest that web and traditional data are not fundamentally different and usually lead to similar conclusions, but also that it is important to consider differences between data types such as populations and research settings. Web data can be a valuable tool for psychologists when accounting for such differences, as it allows for testing established research findings in new contexts, complementing them with insights from novel data sources.

Cite

Agricultural Research Service (2025). Inventory of online public databases and repositories holding agricultural data in 2017 [Dataset]. https://s.cnmilf.com/user74170196/https/catalog.data.gov/dataset/inventory-of-online-public-databases-and-repositories-holding-agricultural-data-in-2017-d4c81

Data from: Inventory of online public databases and repositories holding agricultural data in 2017

Dataset updated

Apr 21, 2025

Dataset provided by

Agricultural Research Service (https://www.ars.usda.gov/)

Description

United States agricultural researchers have many options for making their data available online. This dataset aggregates the primary sources of ag-related data and determines where researchers are likely to deposit their agricultural data. These data serve both as a current landscape analysis and as a baseline for future studies of ag research data.

Purpose

As sources of agricultural data become more numerous and disparate, and collaboration and open data become more expected if not required, this research provides a landscape inventory of online sources of open agricultural data. An inventory of current agricultural data sharing options will help assess how the Ag Data Commons, a platform for USDA-funded data cataloging and publication, can best support data-intensive and multi-disciplinary research. It will also help agricultural librarians assist their researchers in data management and publication. The goals of this study were to: establish where agricultural researchers in the United States (land grant and USDA researchers, primarily ARS, NRCS, USFS and other agencies) currently publish their data, including general research data repositories, domain-specific databases, and the top journals; compare how much data is in institutional vs. domain-specific vs. federal platforms; determine which repositories are recommended by top journals that require or recommend the publication of supporting data; and ascertain where researchers not affiliated with funding or initiatives possessing a designated open data repository can publish data.

Approach

The National Agricultural Library team focused on Agricultural Research Service (ARS), Natural Resources Conservation Service (NRCS), and United States Forest Service (USFS) style research data, rather than ag economics, statistics, and social sciences data. To find domain-specific, general, institutional, and federal agency repositories and databases that are open to US research submissions and have some amount of ag data, resources including re3data, libguides, and ARS lists were analysed. Primarily environmental or public health databases were not included, but places where ag grantees would publish data were considered.

Search methods

We first compiled a list of known domain-specific USDA / ARS datasets and databases that are represented in the Ag Data Commons, including ARS Image Gallery, ARS Nutrition Databases (sub-components), SoyBase, PeanutBase, National Fungus Collection, i5K Workspace @ NAL, and GRIN. We then searched using search engines such as Bing and Google for non-USDA / federal ag databases, using Boolean variations of “agricultural data” / “ag data” / “scientific data” + NOT + USDA (to filter out the federal / USDA results). Most of these results were domain-specific, though some contained a mix of data subjects.

We then used search engines such as Bing and Google to find top agricultural university repositories, using variations of “agriculture”, “ag data” and “university” to find schools with agriculture programs. Using that list of universities, we searched each university web site to see if their institution had a repository for their unique, independent research data if not apparent in the initial web browser search. We found both ag-specific university repositories and general university repositories that housed a portion of agricultural data. Ag-specific university repositories are included in the list of domain-specific repositories. Results included Columbia University – International Research Institute for Climate and Society, UC Davis – Cover Crops Database, etc. If a general university repository existed, we determined whether that repository could filter to include only data results after our chosen ag search terms were applied. General university databases that contain ag data included Colorado State University Digital Collections, University of Michigan ICPSR (Inter-university Consortium for Political and Social Research), and University of Minnesota DRUM (Digital Repository of the University of Minnesota). We then split out NCBI (National Center for Biotechnology Information) repositories.

Next, we searched the internet for open general data repositories using a variety of search engines. Repositories containing a mix of data, journals, books, and other types of records were tested to determine whether they could filter for data results after search terms were applied. General subject data repositories include Figshare, Open Science Framework, PANGAEA, Protein Data Bank, and Zenodo.

Finally, we compared scholarly journal suggestions for data repositories against our list to fill in any missing repositories that might contain agricultural data. Extensive lists of journals in which USDA published in 2012 and 2016 were compiled, combining search results in ARIS, Scopus, and the Forest Service's TreeSearch, plus the USDA web sites Economic Research Service (ERS), National Agricultural Statistics Service (NASS), Natural Resources Conservation Service (NRCS), Food and Nutrition Service (FNS), Rural Development (RD), and Agricultural Marketing Service (AMS). The top 50 journals' author instructions were consulted to see if they (a) ask or require submitters to provide supplemental data, or (b) require submitters to submit data to open repositories. Data are provided for journals based on a 2012 and 2016 study of where USDA employees publish their research studies, ranked by number of articles, including 2015/2016 Impact Factor, Author guidelines, Supplemental Data?, Supplemental Data reviewed?, Open Data (Supplemental or in Repository) Required? and Recommended data repositories, as provided in the online author guidelines for each of the top 50 journals.

Evaluation

We ran a series of searches on all resulting general subject databases with the designated search terms. From the results, we noted the total number of datasets in the repository, type of resource searched (datasets, data, images, components, etc.), percentage of the total database that each term comprised, any dataset with a search term that comprised at least 1% and 5% of the total collection, and any search term that returned greater than 100 and greater than 500 results. We compared domain-specific databases and repositories based on parent organization, type of institution, and whether data submissions were dependent on conditions such as funding or affiliation of some kind.

Results

A summary of the major findings from our data review: Over half of the top 50 ag-related journals from our profile require or encourage open data for their published authors. There are few general repositories that are both large AND contain a significant portion of ag data in their collection. GBIF (Global Biodiversity Information Facility), ICPSR, and ORNL DAAC were among those that had over 500 datasets returned with at least one ag search term and had that result comprise at least 5% of the total collection. Not even one quarter of the domain-specific repositories and datasets reviewed allow open submission by any researcher regardless of funding or affiliation. See included README file for descriptions of each individual data file in this dataset.

Resources in this dataset:

Resource Title: Journals. File Name: Journals.csv
Resource Title: Journals - Recommended repositories. File Name: Repos_from_journals.csv
Resource Title: TDWG presentation. File Name: TDWG_Presentation.pptx
Resource Title: Domain Specific ag data sources. File Name: domain_specific_ag_databases.csv
Resource Title: Data Dictionary for Ag Data Repository Inventory. File Name: Ag_Data_Repo_DD.csv
Resource Title: General repositories containing ag data. File Name: general_repos_1.csv
Resource Title: README and file inventory. File Name: README_InventoryPublicDBandREepAgData.txt
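
The thresholds named in the evaluation step (a term's share of the total collection at 1% and 5%, and result counts above 100 and 500) can be expressed as a small check. The sketch below is only an illustration of that arithmetic: the function name `evaluate_term`, the dictionary keys, and the example counts are hypothetical and are not taken from the dataset files.

```python
def evaluate_term(hits, total):
    """Flag one search term against the thresholds described in the evaluation.

    hits  -- number of results the search term returned in a repository
    total -- total number of datasets in that repository
    """
    share = hits / total if total else 0.0
    return {
        "share": share,
        "at_least_1_percent": share >= 0.01,
        "at_least_5_percent": share >= 0.05,
        "over_100_results": hits > 100,
        "over_500_results": hits > 500,
    }


# Hypothetical counts: 600 hits in a repository of 10,000 datasets.
large = evaluate_term(600, 10_000)
print(large["at_least_5_percent"], large["over_500_results"])  # True True

# Hypothetical counts: 120 hits in a repository of 50,000 datasets.
small = evaluate_term(120, 50_000)
print(small["at_least_1_percent"], small["over_100_results"])  # False True
```

A repository meeting both the 5% and the over-500 conditions corresponds to the "large AND significant portion" case highlighted in the results.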
