CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This dataset underpins research undertaken by the Data Publishing team at Springer Nature, which analysed the impact of Data Availability Statements on Nature journal editors and how researchers choose to share their data. Mandatory Data Availability Statements, which require researchers to state how their data can be accessed, were introduced by Nature journals in 2016. The dataset comprises a single Excel file, which includes the journal title, a unique ID for each published article, subject areas, and the estimated time required to include a Data Availability Statement as reported by the journals' editorial staff. The median time per journal is also calculated. The full text of each Data Availability Statement is included, and the statements are coded according to the data sharing method described. This dataset supports a paper that has been peer reviewed and accepted for presentation at the International Digital Curation Conference 2018. The paper has been submitted to the International Journal of Digital Curation. At the time of dataset release the full paper is available as a preprint on bioRxiv.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
General description
This dataset contains some markers of Open Science in the publications of the Chemical Biology Consortium Sweden (CBCS) between 2010 and July 2023. The sample of CBCS publications during this period consists of 188 articles. Every publication was visited manually at its DOI URL to answer the following questions.
1. Is the research article an Open Access publication?
2. Does the research article have a Creative Commons license or a similar license?
3. Does the research article contain a data availability statement?
4. Did the authors submit data of their study to a repository such as EMBL, Genbank, Protein Data Bank PDB, Cambridge Crystallographic Data Centre CCDC, Dryad or a similar repository?
5. Does the research article contain supplementary data?
6. Do the supplementary data have a persistent identifier that makes them citable as a defined research output?

Variables
The data were compiled in a Microsoft Excel 365 document that includes the following variables.
1. DOI URL of research article
2. Year of publication
3. Research article published with Open Access
4. License for research article
5. Data availability statement in article
6. Supplementary data added to article
7. Persistent identifier for supplementary data
8. Authors submitted data to NCBI or EMBL or PDB or Dryad or CCDC

Visualization
Parts of the data were visualized in two figures as bar diagrams using Microsoft Excel 365. The first figure displays the number of publications per year, the number of publications that are published with open access and the number of publications that contain a data availability statement (Figure 1). The second figure shows the number of publications per year and how many publications contain supplementary data. This figure also shows how many of the supplementary datasets have a persistent identifier (Figure 2).

File formats and software
The file formats used in this dataset are:
.csv (Text file)
.docx (Microsoft Word 365 file)
.jpg (JPEG image file)
.pdf/A (Portable Document Format for archiving)
.png (Portable Network Graphics image file)
.pptx (Microsoft Power Point 365 file)
.txt (Text file)
.xlsx (Microsoft Excel 365 file)
All files can be opened with Microsoft Office 365 and likely also work with the older versions Office 2019 and 2016.

MD5 checksums
Here is a list of all files of this dataset and of their MD5 checksums.
1. Readme.txt (MD5: 795f171be340c13d78ba8608dafb3e76)
2. Manifest.txt (MD5: 46787888019a87bb9d897effdf719b71)
3. Materials_and_methods.docx (MD5: 0eedaebf5c88982896bd1e0fe57849c2)
4. Materials_and_methods.pdf (MD5: d314bf2bdff866f827741d7a746f063b)
5. Materials_and_methods.txt (MD5: 26e7319de89285fc5c1a503d0b01d08a)
6. CBCS_publications_until_date_2023_07_05.xlsx (MD5: 532fec0bd177844ac0410b98de13ca7c)
7. CBCS_publications_until_date_2023_07_05.csv (MD5: 2580410623f79959c488fdfefe8b4c7b)
8. Data_from_CBCS_publications_until_date_2023_07_05_obtained_by_manual_collection.xlsx (MD5: 9c67dd84a6b56a45e1f50a28419930e5)
9. Data_from_CBCS_publications_until_date_2023_07_05_obtained_by_manual_collection.csv (MD5: fb3ac69476bfc57a8adc734b4d48ea2b)
10. Aggregated_data_from_CBCS_publications_until_2023_07_05.xlsx (MD5: 6b6cbf3b9617fa8960ff15834869f793)
11. Aggregated_data_from_CBCS_publications_until_2023_07_05.csv (MD5: b2b8dd36ba86629ed455ae5ad2489d6e)
12. Figure_1_CBCS_publications_until_2023_07_05_Open_Access_and_data_availablitiy_statement.xlsx (MD5: 9c0422cf1bbd63ac0709324cb128410e)
13. Figure_1.pptx (MD5: 55a1d12b2a9a81dca4bb7f333002f7fe)
14. Image_of_figure_1.jpg (MD5: 5179f69297fbbf2eaaf7b641784617d7)
15. Image_of_figure_1.png (MD5: 8ec94efc07417d69115200529b359698)
16. Figure_2_CBCS_publications_until_2023_07_05_supplementary_data_and_PID_for_supplementary_data.xlsx (MD5: f5f0d6e4218e390169c7409870227a0a)
17. Figure_2.pptx (MD5: 0fd4c622dc0474549df88cf37d0e9d72)
18. Image_of_figure_2.jpg (MD5: c6c68b63b7320597b239316a1c15e00d)
19. Image_of_figure_2.png (MD5: 24413cc7d292f468bec0ac60cbaa7809)
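The MD5 checksums listed above can be used to confirm that downloaded files are intact. The following is a minimal sketch in Python using only the standard library; the helper names md5_of_file and verify are hypothetical and not part of the dataset, and only two of the nineteen checksums are copied in for brevity.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=65536):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Expected checksums copied from the dataset's list (two entries shown).
EXPECTED = {
    "Readme.txt": "795f171be340c13d78ba8608dafb3e76",
    "Manifest.txt": "46787888019a87bb9d897effdf719b71",
}


def verify(directory, expected=EXPECTED):
    """Map each file name to True (match), False (mismatch) or None (file missing)."""
    results = {}
    for name, checksum in expected.items():
        path = Path(directory) / name
        results[name] = (md5_of_file(path) == checksum) if path.exists() else None
    return results
```

Calling verify("path/to/download") on the directory holding the downloaded files returns one entry per listed file; extending EXPECTED with the remaining checksums covers the whole dataset.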
A data availability statement (DAS) is the part of a research manuscript that states where the raw data from the study can be accessed. Many journals do not require authors to write a DAS, and in those journals most authors do not include one [1]. In journals that do require a DAS, most authors state that their data are available on request from the corresponding author. However, it has been reported that the overwhelming majority of those corresponding authors do not even respond to a data request, and very few share their data [2].
Apart from a genuine unwillingness to share data, potential reasons for not answering such messages are that the request ended up in a spam folder, that the authors are too busy, or that other team members hold the raw data.
The aim of this study is to assess whether more raw data can be accessed if the data sharing request is sent to all authors versus only requesting data from the corresponding author.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
Data Availability Statement for Article
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
Ionic wind has shown promising applications in many fields
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
Data from field work (crops vegetation in the urban gardens) and the data used for the development of the ANNs, which support the reported results
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
Marine ecological restoration
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
For this dataset, scientific peer-reviewed articles by Tampere University researchers from the years 2020 and 2021 were extracted from TUNICRIS. A random sample of 40 percent was taken from the 4,922 listed publications, stratified by faculty and year. In total 2,085 articles were analyzed, i.e. more than 42 percent of the total number.
To find Data Availability Statements, articles were opened one by one and searched for mentions of research data and its availability. For each article, it was recorded whether a DAS existed and where in the article it was located. From the contents of each DAS, information about data availability, location, openness and possible restrictions on use was recorded.
The dataset also includes information about the journals and publications taken from TUNICRIS.
The prevalence of DAS and data openness were examined in relation to different variables. Tampere University faculty information has been removed from the dataset.
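The sampling step described above (a 40 percent random sample drawn by faculty and year) can be sketched as follows. This is an illustration only, not the authors' procedure: the publication list is invented, the function name stratified_sample is hypothetical, and rounding up within each stratum is an assumption, shown because it is one way a nominal 40 percent sample can end up larger than 40 percent overall.

```python
import math
import random
from collections import defaultdict


def stratified_sample(records, key, fraction, seed=0):
    """Draw ceil(fraction * n) records at random from each stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for record in records:
        strata[key(record)].append(record)
    sample = []
    for group in strata.values():
        size = math.ceil(fraction * len(group))  # assumption: round up per stratum
        sample.extend(rng.sample(group, size))
    return sample


# Invented publication list: (faculty, year, id) tuples, 11 per stratum.
publications = [
    (faculty, year, i)
    for faculty in ("A", "B", "C")
    for year in (2020, 2021)
    for i in range(11)
]
sample = stratified_sample(publications, key=lambda r: (r[0], r[1]), fraction=0.4)
print(f"sampled {len(sample)} of {len(publications)} invented records")

# Share actually analyzed according to the description: 2,085 of 4,922.
reported_share = 2085 / 4922
print(f"reported share: {reported_share:.1%}")
```

The last two lines only restate the figures given in the description and confirm that 2,085 of 4,922 is a little over 42 percent.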
Related slides: https://doi.org/10.5281/zenodo.7655892
Related article (in Finnish): Toikko, T., & Kylmälä, K. (2023). Tutkimusdatan saatavuustiedot tieteellisissä artikkeleissa: Raportti Data Availability Statementien käytöstä Tampereen yliopistossa. Informaatiotutkimus, 42(1-2), 31–50. https://doi.org/10.23978/inf.126098
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
Categories used to classify the data availability statements.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
Two examples of input data from the Universities of Glasgow and Leicester for the UKRN-led Open Research Indicators pilot 4. The overall aim of the pilot was to explore the co-creation of practical methods to monitor the prevalence of DAS in research articles and to assess the quality and usefulness of DAS.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
I have research data associated with this article, and I have included a Data Availability Statement with details on how to access the data in my manuscript file.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
Data and code associated with "The Observed Availability of Data and Code in Earth Science
and Artificial Intelligence" by Erin A. Jones, Brandon McClung, Hadi Fawad, and Amy McGovern.
Instructions: To reproduce figures, download all associated Python and CSV files and place
in a single directory.
Run BAMS_plot.py as you would run Python code on your system.
Code:
BAMS_plot.py: Python code for categorizing data availability statements based on given data
documented below and creating figures 1-3.
Code was originally developed for Python 3.11.7 and run in the Spyder
(version 5.4.3) IDE.
Libraries utilized:
numpy (version 1.26.4)
pandas (version 2.1.4)
matplotlib (version 3.8.0)
For additional documentation, please see code file.
Data:
ASDC_AIES.csv: CSV file containing relevant availability statement data for Artificial
Intelligence for the Earth Systems (AIES)
ASDC_AI_in_Geo.csv: CSV file containing relevant availability statement data for Artificial
Intelligence in Geosciences (AI in Geo.)
ASDC_AIJ.csv: CSV file containing relevant availability statement data for Artificial
Intelligence (AIJ)
ASDC_MWR.csv: CSV file containing relevant availability statement data for Monthly
Weather Review (MWR)
Data documentation:
All CSV files contain the same format of information for each journal. The CSV files above are
needed for the BAMS_plot.py code attached.
Records were analyzed based on the criteria below.
Records:
1) Title of paper
The title of the examined journal article.
2) Article DOI (or URL)
A link to the examined journal article. For AIES, AI in Geo., MWR, the DOI is
generally given. For AIJ, the URL is given.
3) Journal name
The name of the journal where the examined article is published. Either a full
journal name (e.g., Monthly Weather Review), or the acronym used in the
associated paper (e.g., AIES) is used.
4) Year of publication
The year the article was posted online/in print.
5) Is there an ASDC?
If the article contains an availability statement in any form, "yes" is
recorded. Otherwise, "no" is recorded.
6) Justification for non-open data?
If an availability statement contains some justification for why data is not
openly available, the justification is summarized and recorded as one of the
following options: 1) Dataset too large, 2) Licensing/Proprietary, 3) Can be
obtained from other entities, 4) Sensitive information, 5) Available at later
date. If the statement indicates any data is not openly available and no
justification is provided, or if no statement is provided, "None"
is recorded. If the statement indicates openly available data or no data
produced, "N/A" is recorded.
7) All data available
If there is an availability statement and data is produced, "y" is recorded
if means to access data associated with the article are given and there is no
indication that any data is not openly available; "n" is recorded if no means
to access data are given or there is some indication that some or all data is
not openly available. If there is no availability statement or no data is
produced, the record is left blank.
8) At least some data available
If there is an availability statement and data is produced, "y" is recorded
if any means to access data associated with the article are given; "n" is
recorded if no means to access data are given. If there is no availability
statement or no data is produced, the record is left blank.
9) All code available
If there is an availability statement and data is produced, "y" is recorded
if means to access code associated with the article are given and there is no
indication that any code is not openly available; "n" is recorded if no means
to access code are given or there is some indication that some or all code is
not openly available. If there is no availability statement or no data is
produced, the record is left blank.
10) At least some code available
If there is an availability statement and data is produced, "y" is recorded
if any means to access code associated with the article are given; "n" is
recorded if no means to access code are given. If there is no availability
statement or no data is produced, the record is left blank.
11) All data available upon request
If there is an availability statement indicating data is produced and no data
is openly available, "y" is recorded if any data is available upon request to
the authors of the examined journal article (not a request to any other
entity); "n" is recorded if no data is available upon request to the authors
of the examined journal article. If there is no availability statement, any
data is openly available, or no data is produced, the record is left blank.
12) At least some data available upon request
If there is an availability statement indicating data is produced and not all
data is openly available, "y" is recorded if all data is available upon
request to the authors of the examined journal article (not a request to any
other entity); "n" is recorded if not all data is available upon request to
the authors of the examined journal article. If there is no availability
statement, all data is openly available, or no data is produced, the record
is left blank.
13) no data produced
If there is an availability statement that indicates that no data was
produced for the examined journal article, "y" is recorded. Otherwise, the
record is left blank.
14) links work
If the availability statement contains one or more links to a data or code
repository, "y" is recorded if all links work; "n" is recorded if one or more
links do not work. If there is no availability statement or the statement
does not contain any links to a data or code repository, the record is left
blank.
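As an illustration of how records coded this way can be tallied, the sketch below counts, per journal, the articles with an availability statement and those with all data available. It is not the BAMS_plot.py code. The column names (journal, has_asdc, all_data_available) and the sample rows are hypothetical; the real CSV headers should be checked before adapting it.

```python
import csv
import io
from collections import Counter

# Hypothetical header names and invented rows, following the coding rules above:
# "yes"/"no" for the statement, "y"/"n"/blank for all data available.
SAMPLE = """journal,has_asdc,all_data_available
AIES,yes,y
AIES,yes,n
AIES,no,
MWR,yes,y
"""


def summarize(rows):
    """Return {journal: (articles, with statement, with all data available)}."""
    totals, with_asdc, all_open = Counter(), Counter(), Counter()
    for row in rows:
        journal = row["journal"]
        totals[journal] += 1
        if row["has_asdc"].strip().lower() == "yes":
            with_asdc[journal] += 1
        # Blank records (no statement or no data produced) are not counted.
        if row["all_data_available"].strip().lower() == "y":
            all_open[journal] += 1
    return {j: (totals[j], with_asdc[j], all_open[j]) for j in totals}


summary = summarize(csv.DictReader(io.StringIO(SAMPLE)))
for journal, (n, asdc, fully_open) in summary.items():
    print(f"{journal}: {asdc}/{n} with a statement, {fully_open} with all data available")
```

To run the same tally on one of the dataset's files, the io.StringIO(SAMPLE) argument can be replaced by an opened CSV file, with the column names adjusted to the actual headers.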
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
This dataset contains Data Availability Statements from 47,593 papers published in PLOS ONE between March 2014 (when the policy went into effect) and May 2016, analyzed for type of statement.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
This is a Data Availability Statement for a paper whose results are being published.
The DIAMAS project investigates Institutional Publishing Service Providers (IPSP) in the broadest sense, with a special focus on those publishing initiatives that do not charge fees to authors or readers. To collect information on Institutional Publishing in the ERA, a survey was conducted among IPSPs between March and May 2024. This dataset contains aggregated data from the 685 valid responses to the DIAMAS survey on Institutional Publishing. The dataset supplements D2.3 Final IPSP landscape Report Institutional Publishing in the ERA: results from the DIAMAS survey.

The data

Basic aggregate tabular data
Full individual survey responses are not being shared to prevent the easy identification of respondents (in line with conditions set out in the survey questionnaire). This dataset contains full tables with aggregate data for all questions from the survey, with the exception of free-text responses, from all 685 survey respondents. This includes, per question, overall totals and percentages for the answers given as well as the breakdown by both IPSP-types: institutional publishers (IPs) and service providers (SPs). Tables at country level have not been shared, as cell values often turned out to be too low to prevent potential identification of respondents. The data is available in csv and docx formats, with csv files grouped and packaged into ZIP files. Metadata describing data type, question type, as well as question response rate, is available in csv format. The R code used to generate the aggregate tables is made available as well.

Files included in this dataset
survey_questions_data_description.csv - metadata describing data type, question type, as well as question response rate per survey question.
tables_raw_all.zip - raw tables (csv format) with aggregated data per question for all respondents, with the exception of free-text responses. Questions with multiple answers have a table for each answer option. Zip file contains 180 csv files.
tables_raw_IP.zip - as tables_raw_all.zip, for responses from institutional publishers (IP) only. Zip file contains 180 csv files.
tables_raw_SP.zip - as tables_raw_all.zip, for responses from service providers (SP) only. Zip file contains 170 csv files.
tables_formatted_all.docx - formatted tables (docx format) with aggregated data per question for all respondents, with the exception of free-text responses. Questions with multiple answers have a table for each answer option.
tables_formatted_IP.docx - as tables_formatted_all.docx, for responses from institutional publishers (IP) only.
tables_formatted_SP.docx - as tables_formatted_all.docx, for responses from service providers (SP) only.
DIAMAS_Tables_single.R - R script used to generate raw tables with aggregated data for all single response questions.
DIAMAS_Tables_multiple.R - R script used to generate raw tables with aggregated data for all multiple response questions.
DIAMAS_Tables_layout.R - R script used to generate document with formatted tables from raw tables with aggregated data.
DIAMAS Survey on Instititutional Publishing - data availability statement (pdf)
All data are made available under a CC0 license.

Additional ways survey data is made available

IPSP dataset
This dataset contains basic data of Institutional Publishing Service Providers (IPSPs) that will become part of an IPSP Registry in June 2025. It holds data about 651 organisations that responded to the survey and agreed to be listed in the IPSP registry. It contains names, contact information and information on countries, supported languages, parent organisations, IPSP type, services offered, output types, disciplinary and coverage. Downloadable in csv format from: https://doi.org/10.5281/zenodo.8269906. Description is at: https://doi.org/10.5281/zenodo.8305139. Licence: CC0.

Custom aggregate tabular data
The DIAMAS project is open to community requests for performing quantitative analyses not available in the landscape report or the country reports. Until August 2025, and to the extent that capacity is available, quantitative analyses will be carried out and resulting aggregate data openly shared, without disclosing who requested the analysis. Any results will be downloadable in csv and xlsx formats from the DIAMAS Zenodo Community at https://zenodo.org/communities/diamasproject/. Please contact contact@diamasproject.eu for more information.
https://www.apache.org/licenses/LICENSE-2.0.html
The procedures and the data files needed to recreate the figures in the manuscript
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
Change in openness of data availability statements from preprint to published article, grouped by journal data-sharing policy.
Please contact the corresponding author, Shynar Dyussembayeva, for more information on the dataset.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
The file named Coulped Fe-P-S cycling in crab burrows contains the original data for the paper by Xiao et al., "Crab bioturbation drives coupled iron-phosphate-sulfide cycling in mangrove and saltmarsh porewater".
The data underlying scientific papers should be accessible to researchers both now and in the future, but how best can we ensure that these data are available? Here we examine the effectiveness of four approaches to data archiving: no stated archiving policy, recommending (but not requiring) archiving, and two versions of mandating data deposition at acceptance. We control for differences between data types by trying to obtain data from papers that use a single, widespread population genetic analysis, STRUCTURE. At one extreme, we found that mandated data archiving policies that require the inclusion of a data availability statement in the manuscript improve the odds of finding the data online almost 1000-fold compared to having no policy. However, archiving rates at journals with less stringent policies were only very slightly higher than those with no policy at all. We also assessed the effectiveness of asking for data directly from authors and obtained over half of the requested data...