Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This study estimates the effect of data sharing on the citations of academic articles, using journal policies as a natural experiment. We begin by examining 17 high-impact journals that have adopted the requirement that data from published articles be publicly posted. We match these 17 journals to 13 journals without policy changes and find that empirical articles published just before their change in editorial policy have citation rates with no statistically significant difference from those published shortly after the shift. We then ask whether this null result stems from poor compliance with data sharing policies, and use the data sharing policy changes as instrumental variables to examine more closely two leading journals in economics and political science with relatively strong enforcement of new data policies. We find that articles that make their data available receive 97 additional citations (estimate standard error of 34). We conclude that: a) authors who share data may be rewarded eventually with additional scholarly citations, and b) data-posting policies alone do not increase the impact of articles published in a journal unless those policies are enforced.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Last Version: 2
Author: Francisco Rubio, Universitat Politècnica de València.
Date of data collection: 2020/06/23
General description: Datasets can be published in line with the FAIR principles by publishing a data paper (or software paper) in a data journal or in a standard academic journal. The Excel and CSV files contain a list of academic journals that publish data papers and software papers.
File list:
- data_articles_journal_list_v2.xlsx: full list of 56 academic journals in which data papers and/or software papers could be published
- data_articles_journal_list_v2.csv: full list of 56 academic journals in which data papers and/or software papers could be published
Relationship between files: both files have the same information. Two different formats are offered to improve reuse
Type of version of the dataset: final processed version
Versions of the files: 2nd version
- Information updated: number of journals, URL, document types associated with a specific journal, publisher normalization and simplification of document types
- Information added: whether the journal is listed in the Directory of Open Access Journals (DOAJ), whether it is indexed in Web of Science (WOS), and its quartile in Scimago Journal and Country Rank (SJR)
Total size: 32 KB
Version 1: Description
This dataset contains a list of journals that publish data articles, code, software articles and database articles.
The search strategy in DOAJ and Ulrichsweb was to search for the word "data" in journal titles.
Acknowledgements:
Xaquín Lores Torres for his invaluable help in preparing this dataset.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
CBLEED simulation results associated with Figures 4, 5 and 6 in "Autoencoder latent space sensitivity to material structure in convergent-beam low energy electron diffraction". Files are provided in PNG format, together with TXT files containing the raw data corresponding to each image.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This is a hands-on workshop on the management of qualitative social science data, with a focus on data sharing and transparency. While the workshop addresses data management throughout the lifecycle – from data management plan to data sharing – its focus is on the particular challenges in sharing qualitative data and in making qualitative research transparent. One set of challenges concerns the ethical and legal concerns in sharing qualitative data. We will consider obtaining permissions for sharing qualitative data from human participants, strategies for (and limits of) de-identifying qualitative data, and options for restricting access to sensitive qualitative data. We will also briefly look at copyright and licensing and how they can inhibit the public sharing of qualitative data.
A second set of challenges concerns the lack of standardized guidelines for making qualitative research processes transparent. Following on some of the themes touched on in the talk, we will jointly explore some cutting edge approaches for making qualitative research transparent and discuss their potentials as well as shortcomings for different forms of research.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Data underlying the study "Individualised ball velocity prediction in baseball pitching based on IMU data".
The data set contains kinematics and ball velocity of baseball pitchers from the national U18 baseball team as well as six baseball academies in the Netherlands.
Feature Articles on Employment and Labour - Statistics on Job Vacancies
CC0 1.0 Universal Public Domain Dedicationhttps://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
The Portal Project Teaching Database is a simplified version of the Portal Project Database designed for teaching. It provides a real-world example of life-history, population, and ecological data, with sufficient complexity to teach many aspects of data analysis and management, but with many complexities removed to allow students to focus on the core ideas and skills being taught. The database is currently available in csv, json, and sqlite. This database is not designed for research as it intentionally removes some of the real-world complexities. The original database is published at Ecological Archives (http://esapubs.org/archive/ecol/E090/118/), and that original version should be used for research purposes. The Python code used for converting the original database to this teaching version is included as 'create_portal_teach_dataset.py'. Suggested changes or additions to this dataset can be requested or contributed in the project GitHub repository (https://github.com/weecology/portal-teachingdb).
CC0 1.0 Universal Public Domain Dedicationhttps://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
The USDA Agricultural Research Service (ARS) recently established SCINet, which consists of a shared high performance computing resource, Ceres, and the dedicated high-speed Internet2 network used to access Ceres. Current and potential SCINet users are using and generating very large datasets so SCINet needs to be provisioned with adequate data storage for their active computing. It is not designed to hold data beyond active research phases. At the same time, the National Agricultural Library has been developing the Ag Data Commons, a research data catalog and repository designed for public data release and professional data curation. Ag Data Commons needs to anticipate the size and nature of data it will be tasked with handling.
The ARS Web-enabled Databases Working Group, organized under the SCINet initiative, conducted a study to establish baseline data storage needs and practices, and to make projections that could inform future infrastructure design, purchases, and policies. The SCINet Web-enabled Databases Working Group helped develop the survey which is the basis for an internal report. While the report was for internal use, the survey and resulting data may be generally useful and are being released publicly.
From October 24 to November 8, 2016 we administered a 17-question survey (Appendix A) by emailing a Survey Monkey link to all ARS Research Leaders, intending to cover data storage needs of all 1,675 SY (Category 1 and Category 4) scientists. We designed the survey to accommodate either individual researcher responses or group responses. Research Leaders could decide, based on their unit's practices or their management preferences, whether to delegate response to a data management expert in their unit, to all members of their unit, or to themselves collate responses from their unit before reporting in the survey.
Larger storage ranges cover vastly different amounts of data so the implications here could be significant depending on whether the true amount is at the lower or higher end of the range. Therefore, we requested more detail from "Big Data users," those 47 respondents who indicated they had more than 10 to 100 TB or over 100 TB total current data (Q5). All other respondents are called "Small Data users." Because not all of these follow-up requests were successful, we used actual follow-up responses to estimate likely responses for those who did not respond.
We defined active data as data that would be used within the next six months. All other data would be considered inactive, or archival.
To calculate per person storage needs we used the high end of the reported range divided by 1 for an individual response, or by G, the number of individuals in a group response. For Big Data users we used the actual reported values or estimated likely values.
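The per-person calculation described above can be sketched as follows. This is a minimal illustration, not the working group's actual analysis code: the function name, the use of terabytes as the unit, and the argument names are assumptions, and the separate handling of Big Data users (actual or estimated values) is not modeled.

```python
def per_person_storage_tb(range_high_tb: float, group_size: int = 1) -> float:
    """Return storage per person, in TB (unit assumed for illustration).

    range_high_tb: high end of the storage range selected in the survey.
    group_size:    1 for an individual response, or G, the number of
                   individuals covered by a group response.
    """
    if group_size < 1:
        raise ValueError("group_size must be at least 1")
    # High end of the reported range divided by the number of people covered.
    return range_high_tb / group_size


# Individual response that selected a range topping out at 10 TB.
individual = per_person_storage_tb(10)

# Group response for 4 people that selected a range topping out at 100 TB.
group = per_person_storage_tb(100, group_size=4)
```

Using the high end of each range gives an upper estimate per person, which is the conservative choice when provisioning storage.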
Resources in this dataset:
Resource Title: Appendix A: ARS data storage survey questions.
File Name: Appendix A.pdf
Resource Description: The full list of questions asked with the possible responses. The survey was not administered using this PDF but the PDF was generated directly from the administered survey using the Print option under Design Survey. Asterisked questions were required. A list of Research Units and their associated codes was provided in a drop down not shown here.
Resource Software Recommended: Adobe Acrobat, url: https://get.adobe.com/reader/
Resource Title: CSV of Responses from ARS Researcher Data Storage Survey.
File Name: Machine-readable survey response data.csv
Resource Description: CSV file includes raw responses from the administered survey, as downloaded unfiltered from Survey Monkey, including incomplete responses. Also includes additional classification and calculations to support analysis. Individual email addresses and IP addresses have been removed. This is the same data as in the Excel spreadsheet (also provided).
Resource Title: Responses from ARS Researcher Data Storage Survey.
File Name: Data Storage Survey Data for public release.xlsx
Resource Description: MS Excel worksheet that includes raw responses from the administered survey, as downloaded unfiltered from Survey Monkey, including incomplete responses. Also includes additional classification and calculations to support analysis. Individual email addresses and IP addresses have been removed.
Resource Software Recommended: Microsoft Excel, url: https://products.office.com/en-us/excel
CC0 1.0 Universal Public Domain Dedicationhttps://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Data associated with the article "Online education models during COVID-19 are associated with the development and worsening of dry eye disease". The CSV file includes the subjects' general information, OSDI scores and screen exposure times.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Statistics from the paper: Are scholarly articles disproportionately read in their own country? An analysis of Mendeley readers
by Mike Thelwall and Nabeil Maflahi
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
ABSTRACT Context: this document is designed to accompany the articles in the first edition of the new section of the Journal of Contemporary Administration (RAC): the tutorial-articles section. Objective: the purpose is to present the new section and discuss relevant topics of tutorial-articles. Method: I divide the document into three main parts. First, I provide a summary of the current state of the art in open data and open code, which jointly create the context for tutorial-articles. Second, I provide some guidance on the future of the tutorial-articles section, offering a structure and some insights that can be developed further. Third, I offer a short R script to show examples of open data that, I believe, can be used in future tutorial-articles, but also in innovative empirical studies. Conclusion: finally, I provide a short description of the first tutorial-articles accepted for publication in the current edition of RAC.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Using the CrowdTangle API, each of the pink slime news domains was input and searched for public Facebook Page and Group posts from 2019-2024. Since the maximum number of posts possible to return is 1,000, I created a recursive function to halve the timeframe until fewer than 1,000 posts were available and then add in the posts from all the remaining time frames. Each row is a different post from a public Facebook Page or Group linking to a known pink slime website.
Ads were collected via facebook.com/ads/library using the United States location and “Issues, elections or politics” ad category. Each of the ad purchasers listed above was a separate keyword that generated its own CSV via the ad library. These CSVs were then loaded into Python dataframes and concatenated into a single dataframe. Each row represents a different ad that ran on Meta's platforms and was paid for by a pink slime news parent organization.
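The recursive time-window halving described for the post collection can be sketched as follows. This is a hedged illustration, not the author's actual code: `fetch` is a hypothetical callable standing in for a CrowdTangle search over the half-open window [start, end) that returns at most 1,000 posts, and the names `collect_posts`, `MAX_POSTS`, and `min_span` are assumptions introduced here.

```python
from datetime import datetime, timedelta
from typing import Callable, List

MAX_POSTS = 1000  # cap on posts returned per search, as stated in the description


def collect_posts(
    fetch: Callable[[datetime, datetime], List],
    start: datetime,
    end: datetime,
    limit: int = MAX_POSTS,
    min_span: timedelta = timedelta(seconds=1),
) -> List:
    """Collect all posts in [start, end) by recursively halving the window.

    `fetch` is a hypothetical stand-in for the API call; it must treat the
    window as half-open so the two halves never return the same post twice.
    """
    posts = fetch(start, end)
    # Fewer than `limit` results means the response was not truncated.
    # The min_span guard stops the recursion on windows too small to split.
    if len(posts) < limit or (end - start) <= min_span:
        return posts
    mid = start + (end - start) / 2
    left = collect_posts(fetch, start, mid, limit, min_span)
    right = collect_posts(fetch, mid, end, limit, min_span)
    return left + right
```

Splitting on a half-open window keeps the halves disjoint, so no de-duplication step is needed when the partial results are added together.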
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Raw data supporting the Springer Nature Data Availability Statement (DAS) analysis in the State of Open Data 2024. SOOD_2024_special_analysis_DAS_SN.xlsx contains the DAS, DOI, publication date, DAS categories and related country by institution of any author. SOOD 2024_DAS_analysis_sharing.xlsx contains the summary data by country and data sharing type.
Utilizing the Dimensions database, we identified articles containing key DAS identifiers such as “Data Availability Statement” or “Availability of Data and Materials” within their full text. Digital Object Identifiers (DOIs) of these articles were collected and matched against Springer Nature’s XML database to extract the DAS for each article. The extracted DAS were categorized into specific sharing types using text and data matching terms. For statements indicating that data are publicly available in a repository, we matched against a predefined list of repository identifiers, names, and URLs. The DAS were classified into the following categories:
1. Data are available from the author on request.
2. Data are included in the manuscript or its supplementary material.
3. Some or all of the data are publicly available, for example in a repository.
4. Figure source data are included with the manuscript.
5. Data availability is not applicable.
6. Data are declared as not available by the author.
7. Data available online but not in a repository.
These categories are non-exclusive: more than one can apply to any one article. Publications outside the 2019–2023 range and non-article publication types (e.g., book chapters) that were initially included in the Dimensions search results were excluded from the final dataset. Articles were included in the final analysis after applying the exclusion criteria. Upon processing, it was found that only 370 results were returned for Botswana across the five-year period; due to this low number, Botswana was not included in the DAS-focused country-level analysis.
This analysis does not assess the accuracy of the DAS in the context of each individual article. There was no manual verification of the categories applied; as a result, terms used out of context could have led to misclassification. Approximately 5% of articles remained unclassified following text and data matching due to these limitations.
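The kind of non-exclusive text matching described above can be sketched as follows. This is an illustrative sketch only: the actual matching terms and repository list used in the analysis are not given in this description, so every pattern, category key, and repository name below is an assumption, and only four of the seven categories are shown.

```python
import re
from typing import Dict, List

# Hypothetical patterns; the real analysis used its own predefined terms.
PATTERNS: Dict[str, "re.Pattern[str]"] = {
    "on_request": re.compile(r"\b(on|upon) (reasonable )?request\b", re.I),
    "in_manuscript": re.compile(
        r"\b(within|included in) (the|this) (article|manuscript|paper)|supplementary",
        re.I,
    ),
    "repository": re.compile(r"\b(zenodo|figshare|dryad|repository)\b", re.I),
    "not_applicable": re.compile(r"\bnot applicable\b", re.I),
}


def classify_das(statement: str) -> List[str]:
    """Return every category whose pattern matches the statement.

    Categories are non-exclusive, so a statement can receive several
    labels; an empty list means the statement stays unclassified.
    """
    return sorted(name for name, pat in PATTERNS.items() if pat.search(statement))


labels = classify_das("Data are deposited in Zenodo and included in this article.")
```

Because the matching is purely textual, a term used out of context (for example "repository" in a sentence saying no repository was used) would still trigger a label, which is the misclassification risk noted above.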
https://doi.org/10.4121/resource:terms_of_use
The event logs included are intended to support the evaluation of the performance of process discovery algorithms. The largest event logs in this data set have millions of events. If you need even bigger datasets, you can generate these yourself using the CPN Tools source files included (*.cpn). Each file has two parameters: nofcases (i.e., the number of process instances) and nofdupl (i.e., the number of times a process is replicated with unique new names).
This dataset lists the history of publications featured on data.gouv.nc since 2019.
CC0 1.0 Universal Public Domain Dedicationhttps://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
General terms of use for 4TU.Centre for Research Data, for datasets where no other licence is given.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Vietnam's imports of articles of natural cork from the United States were US$621 during 2017, according to the United Nations COMTRADE database on international trade. Vietnam Imports from United States of Articles of Natural Cork - data, historical chart and statistics - was last updated in August 2025.
https://doi.org/10.4121/resource:terms_of_use
The sugar dataset is a multimodal hyperspectral dataset of sugar and sugar-related substances.
The substances that were used to create the dataset are:
- Sugar Ester S170
- Sugar Ester S770
- Sugar Ester S1570
- Sugar Ester P1570
- D-Mannitol
- D-Sorbitol
- D-Glucose
- D-Galactose
- D-Fructose
All of the substances were recorded using different sensors, namely:
- Canon EOS 70D
- ASD FieldSpec 3
- Neo VNIR-1600
- Neo VNIR-1800
- Neo SWIR-320m-e
- Neo SWIR-384
- Nuance Ex
The different sensors cover different wavelength ranges as well as different wavelength resolutions. This creates a unique dataset that not only tackles the question of hyperspectral classification, but also enables research on topics like high-dimensional data exploration, sensor-invariant classification and dimensionality reduction.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
ADCP (Acoustic Doppler Current Profiler) measurements over the upper water column, measured from a frame on the bed.