24 datasets found

Analysis of CBCS publications for Open Access, data availability statements...
figshare.scilifelab.se
researchdata.se
+2 more
txt
Updated Jan 15, 2025
Cite
Theresa Kieselbach (2025). Analysis of CBCS publications for Open Access, data availability statements and persistent identifiers for supplementary data [Dataset]. http://doi.org/10.17044/scilifelab.23641749.v1
Available download formats: txt
Unique identifier
https://doi.org/10.17044/scilifelab.23641749.v1
Dataset updated
Jan 15, 2025
Dataset provided by
Umeå University
Authors
Theresa Kieselbach
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
General description
This dataset contains some markers of Open Science in the publications of the Chemical Biology Consortium Sweden (CBCS) between 2010 and July 2023. The sample of CBCS publications during this period consists of 188 articles. Every publication was visited manually at its DOI URL to answer the following questions.
1. Is the research article an Open Access publication?
2. Does the research article have a Creative Commons license or a similar license?
3. Does the research article contain a data availability statement?
4. Did the authors submit data of their study to a repository such as EMBL, Genbank, Protein Data Bank PDB, Cambridge Crystallographic Data Centre CCDC, Dryad or a similar repository?
5. Does the research article contain supplementary data?
6. Do the supplementary data have a persistent identifier that makes them citable as a defined research output?

Variables
The data were compiled in a Microsoft Excel 365 document that includes the following variables.
1. DOI URL of research article
2. Year of publication
3. Research article published with Open Access
4. License for research article
5. Data availability statement in article
6. Supplementary data added to article
7. Persistent identifier for supplementary data
8. Authors submitted data to NCBI or EMBL or PDB or Dryad or CCDC

Visualization
Parts of the data were visualized in two figures as bar diagrams using Microsoft Excel 365. The first figure displays the number of publications per year, the number of publications that are published with open access and the number of publications that contain a data availability statement (Figure 1). The second figure shows the number of publications per year and how many publications contain supplementary data. This figure also shows how many of the supplementary datasets have a persistent identifier (Figure 2).

File formats and software
The file formats used in this dataset are:
.csv (Text file)
.docx (Microsoft Word 365 file)
.jpg (JPEG image file)
.pdf/A (Portable Document Format for archiving)
.png (Portable Network Graphics image file)
.pptx (Microsoft Power Point 365 file)
.txt (Text file)
.xlsx (Microsoft Excel 365 file)
All files can be opened with Microsoft Office 365 and likely also work with the older versions Office 2019 and 2016.

MD5 checksums
Here is a list of all files in this dataset and their MD5 checksums.
1. Readme.txt (MD5: 795f171be340c13d78ba8608dafb3e76)
2. Manifest.txt (MD5: 46787888019a87bb9d897effdf719b71)
3. Materials_and_methods.docx (MD5: 0eedaebf5c88982896bd1e0fe57849c2)
4. Materials_and_methods.pdf (MD5: d314bf2bdff866f827741d7a746f063b)
5. Materials_and_methods.txt (MD5: 26e7319de89285fc5c1a503d0b01d08a)
6. CBCS_publications_until_date_2023_07_05.xlsx (MD5: 532fec0bd177844ac0410b98de13ca7c)
7. CBCS_publications_until_date_2023_07_05.csv (MD5: 2580410623f79959c488fdfefe8b4c7b)
8. Data_from_CBCS_publications_until_date_2023_07_05_obtained_by_manual_collection.xlsx (MD5: 9c67dd84a6b56a45e1f50a28419930e5)
9. Data_from_CBCS_publications_until_date_2023_07_05_obtained_by_manual_collection.csv (MD5: fb3ac69476bfc57a8adc734b4d48ea2b)
10. Aggregated_data_from_CBCS_publications_until_2023_07_05.xlsx (MD5: 6b6cbf3b9617fa8960ff15834869f793)
11. Aggregated_data_from_CBCS_publications_until_2023_07_05.csv (MD5: b2b8dd36ba86629ed455ae5ad2489d6e)
12. Figure_1_CBCS_publications_until_2023_07_05_Open_Access_and_data_availablitiy_statement.xlsx (MD5: 9c0422cf1bbd63ac0709324cb128410e)
13. Figure_1.pptx (MD5: 55a1d12b2a9a81dca4bb7f333002f7fe)
14. Image_of_figure_1.jpg (MD5: 5179f69297fbbf2eaaf7b641784617d7)
15. Image_of_figure_1.png (MD5: 8ec94efc07417d69115200529b359698)
16. Figure_2_CBCS_publications_until_2023_07_05_supplementary_data_and_PID_for_supplementary_data.xlsx (MD5: f5f0d6e4218e390169c7409870227a0a)
17. Figure_2.pptx (MD5: 0fd4c622dc0474549df88cf37d0e9d72)
18. Image_of_figure_2.jpg (MD5: c6c68b63b7320597b239316a1c15e00d)
19. Image_of_figure_2.png (MD5: 24413cc7d292f468bec0ac60cbaa7809)
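
The MD5 checksums listed for this dataset can be compared against downloaded copies of the files. The following is a minimal sketch, assuming the files were saved to a local directory; the helper names `md5_of_file` and `verify_checksums` are illustrative and not part of the dataset, and only two of the listed entries are copied into `EXPECTED`.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_checksums(directory, expected):
    """Map each file name to True (match), False (mismatch) or None (missing)."""
    results = {}
    for name, checksum in expected.items():
        path = Path(directory) / name
        if not path.is_file():
            results[name] = None
        else:
            results[name] = md5_of_file(path) == checksum.lower()
    return results


# Two entries copied from the checksum list; extend with the remaining files as needed.
EXPECTED = {
    "Readme.txt": "795f171be340c13d78ba8608dafb3e76",
    "Manifest.txt": "46787888019a87bb9d897effdf719b71",
}

if __name__ == "__main__":
    # Assumption: the downloaded files are in the current working directory.
    for name, ok in verify_checksums(".", EXPECTED).items():
        status = {True: "OK", False: "MISMATCH", None: "missing"}[ok]
        print(f"{name}: {status}")
```

A mismatch suggests an incomplete download or a file that was changed after the checksum was published; MD5 is suitable for this kind of integrity check but not for security purposes.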
Data Availability Statement.
figshare.com
docx
Updated Feb 12, 2021
Cite
Olga Alcântara Barros (2021). Data Availability Statement. [Dataset]. http://doi.org/10.6084/m9.figshare.13951607.v1
Available download formats: docx
Unique identifier
https://doi.org/10.6084/m9.figshare.13951607.v1
Dataset updated
Feb 12, 2021
Dataset provided by
Figshare (http://figshare.com/)
Authors
Olga Alcântara Barros
License
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
We analyzed all the samples using a stereomicroscope (Olympus C011 trinocular microscope) coupled with a CCD camera. All the samples were measured and photographed with the Infinity Capture software. The drawing was refined on a drawing tablet (Parblo A610 graphic tablet) using the program ImageJ (public domain). The map of the geographical location of the Araripe Basin was produced with the QGIS Geographic Information System (version 3.12, QGIS.org, public domain) using the coordinate system datum SIRGAS 2000 from Instituto Brasileiro de Geografia e Estatística (IBGE, Brazil) and Companhia de Pesquisa de Recursos Minerais (CPRM, Brazil). The stratigraphy of the Santana Group was drawn with ImageJ (public domain) following the stratigraphy in Neumann & Cabreira, 1999 and Valença et al., 2003.
Replication data for: An Analysis of Data Availability Statements in...
search.dataone.org
dataverse.harvard.edu
Updated Oct 29, 2025
Cite
Karcher, Sebastian; Robey, Derek; Kirilova, Dessislava; Weber, Nic (2025). Replication data for: An Analysis of Data Availability Statements in Qualitative Research Journal Articles [Dataset]. http://doi.org/10.7910/DVN/THG8MN
Unique identifier
https://doi.org/10.7910/DVN/THG8MN
Dataset updated
Oct 29, 2025
Dataset provided by
Harvard Dataverse
Authors
Karcher, Sebastian; Robey, Derek; Kirilova, Dessislava; Weber, Nic
Description
Summary
Over the past decade, many scholarly journals have adopted policies on data sharing, with an increasing number of journals requiring that authors share the data underlying their published work. Frequently, qualitative data are excluded from those policies explicitly or implicitly. A few journals, however, intentionally do not make such a distinction. This project focuses on articles published in eight of the open-access journals maintained by Public Library of Science (PLOS). All PLOS journals introduced strict data sharing guidelines in 2014, applying to all empirical data on the basis of which articles are published. We collected a database of more than 2,300 articles containing a qualitative data component published between January 1, 2015 and August 23, 2023 and analyzed the data availability statements (DAS) researchers made regarding the availability, or lack thereof, of their data. We describe the degree to which and manner in which data are reportedly available (for example, in repositories, via institutional gate-keepers, or on request from the author) versus those that are declared to be unavailable. We also outline several dimensions of patterned variation in the data availability statements, including temporal patterns and variation by data type. Based on the results, we also provide recommendations to researchers on how to make their data availability statements clearer, more transparent and more informative, and to journal editors and reviewers on how to interpret and evaluate statements to ensure they accurately reflect a given data availability scenario. Finally, we suggest a workflow which can link interactions with repositories most productively as part of a typical editorial process.

Data Overview
This data deposit includes data and code to assemble the dataset, generate all figures and values used in the paper and appendix, and generate the codebook. It also includes the codebook and the figures.
The analysis.R script and the data in data/analysis are sufficient to reproduce all findings in the paper. The additional scripts and the data files in data/raw are included for full transparency and to facilitate the detection of any errors in the data processing pipeline. Their structure is due to the development of the project over time.
Data Availability Statements in the 2020 and 2021 scientific publications of...
zenodo.org
nde-dev.biothings.io
+2 more
csv, pdf
Updated Jul 12, 2024
Cite
Kaisa Kylmälä; Tomi Toikko (2024). Data Availability Statements in the 2020 and 2021 scientific publications of Tampere University [Dataset]. http://doi.org/10.5281/zenodo.7564441
Available download formats: pdf, csv
Unique identifier
https://doi.org/10.5281/zenodo.7564441
Dataset updated
Jul 12, 2024
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Kaisa Kylmälä; Tomi Toikko
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
Tampere
Description
For this dataset, scientific peer-reviewed articles by Tampere University researchers from the years 2020 and 2021 were extracted from TUNICRIS. A random sample of 40 percent was taken from the listed 4,922 publications according to faculties and years. In total, 2,085 articles were analyzed, i.e. more than 42 percent of the total number.
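
The sampling described here (a fixed share drawn within each faculty and year) is a form of stratified random sampling, and 2,085 of 4,922 publications corresponds to roughly 42.4 percent. The following is a minimal sketch of such a draw, not the authors' procedure: the function name `stratified_sample`, the record layout and the choice to round each stratum size up are all assumptions made for illustration. Rounding up per stratum is one way an overall share could end up above the nominal 40 percent.

```python
import math
import random
from collections import defaultdict


def stratified_sample(records, key, fraction, seed=0):
    """Draw `fraction` of the records from every stratum defined by `key`."""
    rng = random.Random(seed)  # fixed seed so the draw is repeatable
    strata = defaultdict(list)
    for record in records:
        strata[key(record)].append(record)
    sample = []
    for group in strata.values():
        # Assumption: stratum sizes are rounded up, never down.
        size = min(len(group), math.ceil(fraction * len(group)))
        sample.extend(rng.sample(group, size))
    return sample


# Hypothetical records: (faculty, year, article id). Not taken from the dataset.
records = [("A", 2020, i) for i in range(5)] + [("B", 2021, i) for i in range(3)]
sample = stratified_sample(records, key=lambda r: (r[0], r[1]), fraction=0.4)
print(len(sample), "of", len(records), "records sampled")
```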

To find Data Availability Statements (DAS), articles were opened one by one and searched for mentions of research data and its availability. For each article, it was recorded whether a DAS existed and where in the article it was located. From the contents of each DAS, information about data availability, location, openness and possible restrictions on use was recorded.

The dataset also includes information about the journals and publications taken from TUNICRIS.

The prevalence of DAS and data openness were examined in relation to different variables. Tampere University faculty information has been removed from the dataset.

Related slides: https://doi.org/10.5281/zenodo.7655892

Related article (in Finnish): Toikko, T., & Kylmälä, K. (2023). Tutkimusdatan saatavuustiedot tieteellisissä artikkeleissa: Raportti Data Availability Statementien käytöstä Tampereen yliopistossa. Informaatiotutkimus, 42(1-2), 31–50. https://doi.org/10.23978/inf.126098
Data from: Data sharing in PLOS ONE: An analysis of Data Availability...
figshare.com
datasetcatalog.nlm.nih.gov
txt
Updated Feb 9, 2018
Cite
Lisa Federer (2018). Data sharing in PLOS ONE: An analysis of Data Availability Statements [Dataset]. http://doi.org/10.6084/m9.figshare.5690878.v1
Available download formats: txt
Unique identifier
https://doi.org/10.6084/m9.figshare.5690878.v1
Dataset updated
Feb 9, 2018
Dataset provided by
Figshare (http://figshare.com/)
Authors
Lisa Federer
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset contains Data Availability Statements from 47,593 papers published in PLOS ONE between March 2014 (when the policy went into effect) and May 2016, analyzed for type of statement.
Analysis of publications of the Swedish Metabolomics Centre for Open Access...
figshare.scilifelab.se
researchdata.se
+1 more
txt
Updated Sep 19, 2025
Cite
Theresa Kieselbach (2025). Analysis of publications of the Swedish Metabolomics Centre for Open Access licenses, data availability statements and access to data [Dataset]. http://doi.org/10.17044/scilifelab.29392007.v1
Available download formats: txt
Unique identifier
https://doi.org/10.17044/scilifelab.29392007.v1
Dataset updated
Sep 19, 2025
Dataset provided by
Umeå University
Authors
Theresa Kieselbach
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Content and data source
This dataset contains the results of a manual analysis of Open Science markers in the publications of the Swedish Metabolomics Centre (SMC) between 2016 and 2024. It contains variables similar to those in the data of the "Analysis of CBCS publications for Open Access, data availability statements and persistent identifiers for supplementary data" (Kieselbach, 2023).
The sample of these publications was fetched from SciLifeLab on 5 May 2025 at the URL: https://publications.scilifelab.se/label/Swedish Metabolomics Centre (SMC)
It contains 285 articles that are the source data for the work to create this dataset. Every publication was manually visited at its DOI URL and checked for 23 variables.

Questions studied
Some of the questions that were addressed in the collection of the data are:
1. Does the article have an open license and what kind of license does it have?
2. Does the article contain research data that may have restricted access such as personal data and health data?
3. Does the article contain a data availability statement?
4. Does the article contain supplementary material that the authors added to it?
5. Does the supplementary material contain research data?
6. Does the supplementary material contain metabolomics data such as, for instance, summaries and visualizations?
7. Did the authors submit metabolomics data to MetaboLights at the EBI or to other repositories?
8. Did the authors submit other data to other repositories?
9. Is data available on request from the authors?

Visualization of data
The data was compiled and visualized using Microsoft Excel 365. The visualization includes one table that gives a general overview of the dataset, and four figures that show some results of the analysis.
Figure 1. Percentage of publications between 2016 and 2024 with an Open Access License and with a data availability statement.
Figure 2. Submissions to repositories between 2016 and 2024.
Figure 3. Percentage of publications that contained supplementary material and whether this supplementary material contained research data and metabolomics data.
Figure 4. Repositories used by the authors between 2016 and 2024.

List of variables
1. Year of Publication (answer: year)
2. Date of Publication (answer: date)
3. DOI (answer: DOI)
4. DOI URL (answer: DOI URL)
5. Research article (answer: Yes or No)
6. Access to article without paywall (answer: Yes or No)
7. License for research article (answer: Name of the license or No)
8. Data with restricted access (answer: Yes or No)
9. Data availability statement in article (answer: Yes or No)
10. Supplementary material added to article (answer: Yes or No)
11. Access to supplementary material without paywall (answer: Yes or No)
12. Supplementary material contains research data (answer: Yes or No)
13. Supplementary data contains metabolomics data (answer: Yes or No)
14. Persistent identifier for supplementary data (answer: Yes or No)
15. Source data added to the article (answer: Yes or No)
16. Source data contain metabolomics data (answer: Yes or No)
17. Authors submitted metabolomics data to MetaboLights (answer: Yes or No)
18. Authors submitted metabolomics data to another repository (answer: name of the repository or No)
19. Authors submitted other data to a repository (answer: name of the repository or No)
20. Authors submitted other data to a second repository (answer: name of the repository or No)
21. Authors submitted other data to a third repository (answer: name of the repository or No)
22. Authors submitted code to a repository (answer: name of the repository or No)
23. Data available on request from the authors (answer: Yes or No)

Variables that are available in the source data
1. Title of article
2. Authors
3. Journal
4. Year
5. (Date) Published
6. (Date) E-published
7. Volume
8. Issue
9. Pages
10. DOI
11. PMID
12. Labels
13. Qualifiers
14. IUID
15. URL
16. DOI URL of research article
17. PubMed URL of research article

File formats and software
The file formats used in this dataset are:
.csv (Text file)
.jpg (JPEG image file)
.pdf/A (Portable Document Format for archiving)
.txt (Text file)
.xlsx (Microsoft Excel 365 file)
All files can be opened with Microsoft Office 365.

Reference
Kieselbach, Theresa (2023). Analysis of CBCS publications for Open Access, data availability statements and persistent identifiers for supplementary data. Umeå University. Dataset. https://doi.org/10.17044/scilifelab.23641749.v1

Abbreviations
CC BY 4.0: Creative Commons Attribution 4.0 International Public License
CC BY-NC 4.0: Creative Commons Attribution-NonCommercial 4.0 International Public License
CC BY-NC 3.0: Creative Commons Attribution-NonCommercial 3.0 International Public License
CC BY-NC-ND 4.0: Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International Public License
DOI: Digital Object Identifier
EBI: European Bioinformatics Institute
EBI-ArrayExpress: The ArrayExpress collection of functional genomics data at the EBI
EBI-ENA: European Nucleotide Archive at the EBI
EBI-Pride: Proteomics Identification Database at the EBI
e!DAL: electronic Data Archive Library at the Leibniz Institute for Plant Genetics and Crop Plant Research
IUID: Item Unique identification
LUDC: Lund University Diabetes Centre
LUDC repository: data repository at the Lund University Diabetes Centre
NCBI: National Center for Biotechnology Information
NCBI-GEO: The Gene Expression Omnibus database repository at the NCBI
NCBI-SRA: The Sequence Read Archive at the NCBI
PMID: PubMed Identifier
URL: Uniform Resource Locator

MD5 checksums of the files
Manifest.txt (2 KB): 89f32a728fb74ebecef0aef4633130b0
README.txt (6 KB): 34ea4ad9cb9bdea54755fa87f2d0b913
Analysis_SMC_publications_2016_2024_Open_Access_publication_and_access_to_data_status_2025_06_24.csv (46 KB): 9719df26381901bc6aabfd34fdbfab81
Analysis_SMC_publications_2016_2024_Open_Access_publication_and_access_to_data_status_2025_06_24.xlsx (49 KB): 1ec95dc29262645240e7d8714967bcfc
Table_1_Overview_SMC_publications_2016_2024_status_2025_06_11.csv (391 Bytes): 1fd723dc6f52f18251d41c0d343a4f0f
Table_1_Overview_SMC_publications_2016_2024_status_2025_06_11.xlsx (9 KB): 38622a9681c6f1057a6e1a4be56b0285
Figure_1_SMC_publications_2016_2024_open_access_license_and_data_availability_status_2025_06_11.csv (468 Bytes): 9f9156f8d52603ccdec968f626bc002a
Figure_1_SMC_publications_2016_2024_open_access_license_and_data_availability_status_2025_06_11.jpg (119 KB): dc9a4d7de4c789e8aea46ce66e007301
Figure_1_SMC_publications_2016_2024_open_access_license_and_data_availability_status_2025_06_11.xlsx (15 KB): 6527d1ebd0069ef3757bd1b049f0fc74
Figure_2_SMC_publications_2016_2024_metabolomics_data_and_other_data_to_repositories_status_2024_06_12.csv (300 Bytes): 5abc4a0fcf776f8dc4745f41deddacbc
Figure_2_SMC_publications_2016_2024_metabolomics_data_and_other_data_to_repositories_status_2024_06_12.jpg (126 KB): e03e5bf4ba2d942c3b022aebb0a59033
Figure_2_SMC_publications_2016_2024_metabolomics_data_and_other_data_to_repositories_status_2024_06_12.xlsx (15 KB): a80f977c051d4798db221b07733c694b
Figure_3_SMC_publications_2016_2024_overview_supplementary_data_status_2025_06_11.csv (670 Bytes): a694a3defa98aa52fcdec8ff9e9e3316
Figure_3_SMC_publications_2016_2024_overview_supplementary_data_status_2025_06_11.jpg (153 KB): 3928bdc1f046ca9b6f66bdbcdf936ca8
Figure_3_SMC_publications_2016_2024_overview_supplementary_data_status_2025_06_11.xlsx (15 KB): 46dfda56b116b571b4bf8e3674b44512
Figure_4_SMC_publications_2016_2024_submission_of_data_to_repositories_status_2025_06_12.csv (498 Bytes): 8963a412cc9e458ced2e80883bb93e1a
Figure_4_SMC_publications_2016_2024_submission_of_data_to_repositories_status_2025_06_12.jpg (137 KB): c9ba447225e99431f24732128a754b7e
Figure_4_SMC_publications_2016_2024_submission_of_data_to_repositories_status_2025_06_12.xlsx (16 KB): 1e2813d3ccb0ee14991b276947c21b8a
Materials_and_methods_SMC_publications_2016_2024.docx (19 KB): 71776ffc1e530e1b40255763403b2f40
Materials_and_methods_SMC_publications_2016_2024.txt (4 KB): 26c4b91b958b9e33d93d13dc52b25da9
Materials_and_methods_SMC_publications_2026_2024.pdf (172 KB): eee564f452ef4f3cf57bb81a6874fcd4
SMC_publications_2016_2024_status_2025_05_05.csv (143 KB): 5e61d09244ca90b1e5b057a7afdfe5e7
SMC_publications_2016_2024_status_2025_05_05.xlsx (106 KB): 6977fbcac21ff5a12763e40de90c0a91
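
Yearly percentages such as those described for Figure 1 of this dataset can be recomputed from the Yes/No variables. The following is a minimal sketch, assuming the CSV column headers match the variable names "Year of Publication" and "Data availability statement in article" (the actual headers in the files may differ); the function name `yearly_percentage` is illustrative and the sample rows are invented, not taken from the dataset.

```python
import csv
import io
from collections import defaultdict


def yearly_percentage(rows, year_column, flag_column):
    """Return {year: percentage of rows whose flag column equals 'Yes'}."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for row in rows:
        year = row[year_column].strip()
        totals[year] += 1
        if row[flag_column].strip().lower() == "yes":
            hits[year] += 1
    return {year: 100.0 * hits[year] / totals[year] for year in sorted(totals)}


# Invented rows for illustration only; they are not taken from the dataset.
SAMPLE_CSV = """Year of Publication,Data availability statement in article
2016,No
2016,Yes
2017,Yes
2017,Yes
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
shares = yearly_percentage(
    rows, "Year of Publication", "Data availability statement in article"
)
for year, share in shares.items():
    print(f"{year}: {share:.1f}% with a data availability statement")
```

Variables that record a name instead of Yes (for example "License for research article") would need a different test, such as treating every value other than "No" as a positive answer.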
Description of coding categories and example statements.
plos.figshare.com
datasetcatalog.nlm.nih.gov
xls
Updated May 30, 2023
Cite
Lisa M. Federer; Christopher W. Belter; Douglas J. Joubert; Alicia Livinski; Ya-Ling Lu; Lissa N. Snyders; Holly Thompson (2023). Description of coding categories and example statements. [Dataset]. http://doi.org/10.1371/journal.pone.0194768.t001
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pone.0194768.t001
Dataset updated
May 30, 2023
Dataset provided by
PLOS (http://plos.org/)
Authors
Lisa M. Federer; Christopher W. Belter; Douglas J. Joubert; Alicia Livinski; Ya-Ling Lu; Lissa N. Snyders; Holly Thompson
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Description of coding categories and example statements.
Dataset #1: Cross-sectional survey data
figshare.com
txt
Updated Jul 19, 2023
Cite
Adam Baimel (2023). Dataset #1: Cross-sectional survey data [Dataset]. http://doi.org/10.6084/m9.figshare.23708730.v1
Available download formats: txt
Unique identifier
https://doi.org/10.6084/m9.figshare.23708730.v1
Dataset updated
Jul 19, 2023
Dataset provided by
Figshare (http://figshare.com/)
Authors
Adam Baimel
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
N.B. This is not real data. Only here for an example for project templates.

Project Title: Add title here

Project Team: Add contact information for research project team members

Summary: Provide a descriptive summary of the nature of your research project and its aims/focal research questions.

Relevant publications/outputs: When available, add links to the related publications/outputs from this data.

Data availability statement: If your data is not linked on figshare directly, provide links to where it is being hosted here (i.e., Open Science Framework, Github, etc.). If your data is not going to be made publicly available, please provide details here as to the conditions under which interested individuals could gain access to the data and how to go about doing so.

Data collection details: 1. When was your data collected? 2. How were your participants sampled/recruited?

Sample information: How many and who are your participants? Demographic summaries are helpful additions to this section.

Research Project Materials: What materials are necessary to fully reproduce the contents of your dataset? Include a list of all relevant materials (e.g., surveys, interview questions) with a brief description of what is included in each file that should be uploaded alongside your datasets.

List of relevant datafile(s): If your project produces data that cannot be contained in a single file, list the names of each of the files here with a brief description of what parts of your research project each file is related to.

Data codebook: What is in each column of your dataset? Provide variable names as they are encoded in your data files, verbatim question associated with each response, response options, details of any post-collection coding that has been done on the raw-response (and whether that's encoded in a separate column).

Examples available at: https://www.thearda.com/data-archive?fid=PEWMU17 https://www.thearda.com/data-archive?fid=RELLAND14
Data Template for UKRN Research Indicators Pilot 4
figshare.com
xlsx
Updated Jul 3, 2024
Cite
Mick Eadie; Valerie McCutcheon; Radoslaw Pajor; Laurian Williamson (2024). Data Template for UKRN Research Indicators Pilot 4 [Dataset]. http://doi.org/10.6084/m9.figshare.26165794.v1
Available download formats: xlsx
Unique identifier
https://doi.org/10.6084/m9.figshare.26165794.v1
Dataset updated
Jul 3, 2024
Dataset provided by
Figshare (http://figshare.com/)
Authors
Mick Eadie; Valerie McCutcheon; Radoslaw Pajor; Laurian Williamson
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This is the template for datasets analysed as part of the United Kingdom Reproducibility Network (UKRN) Research Indicators Project, pilot 4: the prevalence and quality of data availability statements.
Dataset of a Study of Computational reproducibility of Jupyter notebooks...
zenodo.org
pdf, zip
Updated Jul 11, 2024
Cite
Sheeba Samuel; Daniel Mietchen (2024). Dataset of a Study of Computational reproducibility of Jupyter notebooks from biomedical publications [Dataset]. http://doi.org/10.5281/zenodo.8226725
Available download formats: zip, pdf
Unique identifier
https://doi.org/10.5281/zenodo.8226725
Dataset updated
Jul 11, 2024
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Sheeba Samuel; Daniel Mietchen
License
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
This repository contains the dataset for the study of computational reproducibility of Jupyter notebooks from biomedical publications. We analyzed the reproducibility of Jupyter notebooks from GitHub repositories associated with publications indexed in the biomedical literature repository PubMed Central. The dataset includes the metadata of the journals, the publications, the GitHub repositories mentioned in the publications, and the notebooks present in those GitHub repositories.

Data Collection and Analysis

We used the code for reproducibility of Jupyter notebooks from the study by Pimentel et al., 2019 and adapted the code from ReproduceMeGit. We provide code for collecting the publication metadata from PubMed Central using NCBI Entrez utilities via Biopython.

Our approach involves searching PMC with the esearch function for Jupyter notebooks using the query: "(ipynb OR jupyter OR ipython) AND github". We retrieve the data in XML format, capturing details about journals and articles. We extract GitHub links by scanning the entire article, including the abstract, body, data availability statement, and supplementary materials. Additionally, we mine repositories for key information such as dependency declarations found in files like requirements.txt, setup.py, and pipfile. Using the GitHub API, we enrich our data with repository creation dates, update histories, pushes, and programming languages.
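
The search and link-extraction steps described here can be sketched with the Python standard library. This is an illustration under stated assumptions, not the authors' code: the study reports using Biopython's Entrez wrapper, whereas this sketch only builds the equivalent NCBI E-utilities esearch URL without sending a request, and the function names, the `retmax` default and the regular expression are all choices made for the example.

```python
import re
from urllib.parse import urlencode

# Public NCBI E-utilities endpoint for esearch.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"
# Query string quoted in the dataset description.
QUERY = "(ipynb OR jupyter OR ipython) AND github"


def build_esearch_url(term, retmax=100):
    """Build an esearch URL for the PMC database (no request is sent)."""
    params = {"db": "pmc", "term": term, "retmax": retmax}
    return ESEARCH_URL + "?" + urlencode(params)


# Assumption: a repository link has the form github.com/<owner>/<name>.
GITHUB_LINK = re.compile(r"https?://github\.com/[\w.-]+/[\w.-]+", re.IGNORECASE)


def extract_github_links(text):
    """Return the distinct GitHub repository URLs found in `text`, in order."""
    links = []
    for match in GITHUB_LINK.findall(text):
        url = match.rstrip(".")  # drop a sentence-ending full stop
        if url not in links:
            links.append(url)
    return links


example_text = (
    "Code is available at "
    "https://github.com/fusion-jena/computational-reproducibility-pmc. "
    "See https://github.com/fusion-jena/computational-reproducibility-pmc for details."
)
print(build_esearch_url(QUERY))
print(extract_github_links(example_text))
```

A real article scan would run `extract_github_links` over each section of the retrieved XML; a simple pattern like this one will miss links written without a scheme or split across lines.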

All the extracted information is stored in a SQLite database. After collecting and creating the database tables, we ran a pipeline to collect the Jupyter notebooks contained in the GitHub repositories based on the code from Pimentel et al., 2019.

Our reproducibility pipeline was started on 27 March 2023.

Repository Structure

Our repository is organized into two main folders:

archaeology: This directory hosts scripts designed to download, parse, and extract metadata from PubMed Central publications and associated repositories. There are 24 database tables created which store the information on articles, journals, authors, repositories, notebooks, cells, modules, executions, etc. in the db.sqlite database file.

analyses: This directory contains the notebooks used for the in-depth analysis of the data in our study. The db.sqlite file generated by running the scripts in the archaeology folder is stored in the analyses folder for further analysis. The path can, however, be configured in the config.py file. There are two sets of notebooks: one set (naming pattern N[0-9]*.ipynb) is focused on examining data pertaining to repositories and notebooks, while the other set (PMC[0-9]*.ipynb) is for analyzing data associated with publications in PubMed Central, i.e. for plots involving data about articles, journals, publication dates or research fields. The resulting figures from these notebooks are stored in the 'outputs' folder.
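
The two naming patterns can be used to separate the notebook sets programmatically. A minimal sketch, assuming the patterns are read as shell-style globs; the function name `split_notebooks` and the example file names other than Index.ipynb are hypothetical.

```python
from fnmatch import fnmatchcase

REPOSITORY_PATTERN = "N[0-9]*.ipynb"     # repositories and notebooks
PUBLICATION_PATTERN = "PMC[0-9]*.ipynb"  # PubMed Central publications


def split_notebooks(filenames):
    """Return (repository notebooks, publication notebooks), each sorted."""
    repository = sorted(n for n in filenames if fnmatchcase(n, REPOSITORY_PATTERN))
    publication = sorted(n for n in filenames if fnmatchcase(n, PUBLICATION_PATTERN))
    return repository, publication


# Hypothetical file names, used only to show how the patterns behave.
names = ["N0.Example.ipynb", "PMC01.Example.ipynb", "Index.ipynb"]
print(split_notebooks(names))
```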

MethodsWorkflow: The MethodsWorkflow file provides a conceptual overview of the workflow used in this study.

Accessing Data and Resources:

All the data generated during the initial study can be accessed at https://doi.org/10.5281/zenodo.6802158

For the latest results and re-run data, refer to this link.

The comprehensive SQLite database that encapsulates all the study's extracted data is stored in the db.sqlite file.
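
The contents of an SQLite file such as db.sqlite can be inspected with Python's built-in sqlite3 module. A minimal sketch: the helper names `list_tables` and `row_counts` are illustrative, and the demonstration uses an in-memory database with two made-up tables so that it runs without the dataset.

```python
import sqlite3


def list_tables(connection):
    """Return the names of all user tables in an SQLite database, sorted."""
    cursor = connection.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' "
        "AND name NOT LIKE 'sqlite_%' ORDER BY name"
    )
    return [row[0] for row in cursor]


def row_counts(connection):
    """Return {table name: number of rows} for every user table."""
    counts = {}
    for table in list_tables(connection):
        # Table names come from sqlite_master, so quoting them is sufficient here.
        query = f'SELECT COUNT(*) FROM "{table}"'
        counts[table] = connection.execute(query).fetchone()[0]
    return counts


# Demonstration on an in-memory database with hypothetical tables.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE articles (id INTEGER PRIMARY KEY, title TEXT)")
demo.execute("CREATE TABLE notebooks (id INTEGER PRIMARY KEY, name TEXT)")
demo.execute("INSERT INTO articles (title) VALUES ('example')")
print(row_counts(demo))
```

For the actual file, replace the in-memory connection with `sqlite3.connect("db.sqlite")`, adjusting the path to wherever the file was extracted.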

The metadata in XML format extracted from PubMed Central, which contains the information about the articles and journals, can be accessed in the pmc.xml file.

System Requirements:

CentOS 7 (Documentation: https://www.centos.org/)

Conda 4.9.4 (Installation Guide: https://docs.anaconda.com/anaconda/install/linux/)

Python 3.7.6 (Download Link: https://www.python.org/downloads/)

GitHub account (Get Started: https://github.com/, Requires GitHub Username and Token)

gcc 7.3.0 (Installation Guide: https://gcc.gnu.org/install/)

lbzip2 (Command: `conda install -c conda-forge lbzip2`)

Running the pipeline:

Clone the computational-reproducibility-pmc repository using Git:
git clone https://github.com/fusion-jena/computational-reproducibility-pmc.git

Navigate to the computational-reproducibility-pmc directory:
cd computational-reproducibility-pmc/computational-reproducibility-pmc

Configure environment variables in the config.py file:
GITHUB_USERNAME = os.environ.get("JUP_GITHUB_USERNAME", "add your github username here")
GITHUB_TOKEN = os.environ.get("JUP_GITHUB_PASSWORD", "add your github token here")

Other environment variables can also be set in the config.py file.
BASE_DIR = Path(os.environ.get("JUP_BASE_DIR", "./")).expanduser() # Add the path of directory where the GitHub repositories will be saved
DB_CONNECTION = os.environ.get("JUP_DB_CONNECTION", "sqlite:///db.sqlite") # Add the path where the database is stored.
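
The settings shown above follow a common pattern: each value is read from an environment variable and falls back to a default. A self-contained sketch of that pattern, with the imports the lines need; the wrapper function `load_settings` is hypothetical and not part of the repository's config.py, while the variable names and defaults are copied from the lines above.

```python
import os
from pathlib import Path


def load_settings(environ=os.environ):
    """Read the settings, falling back to the defaults used in config.py."""
    return {
        "GITHUB_USERNAME": environ.get(
            "JUP_GITHUB_USERNAME", "add your github username here"
        ),
        # Note: the token is read from the variable named JUP_GITHUB_PASSWORD.
        "GITHUB_TOKEN": environ.get(
            "JUP_GITHUB_PASSWORD", "add your github token here"
        ),
        # Directory where the GitHub repositories will be saved.
        "BASE_DIR": Path(environ.get("JUP_BASE_DIR", "./")).expanduser(),
        # Connection string of the database.
        "DB_CONNECTION": environ.get("JUP_DB_CONNECTION", "sqlite:///db.sqlite"),
    }


# Passing a plain dict shows how an override replaces one default.
settings = load_settings({"JUP_BASE_DIR": "/tmp/repos"})
print(settings["BASE_DIR"], settings["DB_CONNECTION"])
```

Exporting the variables in the shell before running the pipeline keeps the GitHub token out of the source file.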

To set up conda environments for each Python version, upgrade pip, install pipenv, and install the archaeology package in each environment, execute:
source conda-setup.sh

Change to the archaeology directory
cd archaeology

Activate conda environment. We used py36 to run the pipeline.
conda activate py36

Execute the main pipeline script (r0_main.py):
python r0_main.py

Running the analysis:

Navigate to the analyses directory.
cd analyses

Activate conda environment. We use raw38 for the analysis of the metadata collected in the study.
conda activate raw38

Install the required packages using the requirements.txt file.
pip install -r requirements.txt

Launch JupyterLab:
jupyter lab

Refer to the Index.ipynb notebook for the execution order and guidance.

References:

Sheeba Samuel, Daniel Mietchen. (2024). Computational reproducibility of Jupyter notebooks from biomedical publications, https://doi.org/10.1093/gigascience/giad113, GigaScience

Sheeba Samuel, Daniel Mietchen. (2022). Computational reproducibility of Jupyter notebooks from biomedical publications, https://arxiv.org/pdf/2209.04308.pdf, CoRR abs/2209.04308

Sheeba Samuel, & Daniel Mietchen. (2022). Dataset of a Study of Computational reproducibility of Jupyter notebooks from biomedical publications [Data set]. Zenodo. https://doi.org/10.5281/zenodo.6802158
Table1_Data Availability of Open T-Cell Receptor Repertoire Data, a...
frontiersin.figshare.com
datasetcatalog.nlm.nih.gov
docx
Updated Jun 5, 2023
Cite
Yu-Ning Huang; Naresh Amrat Patel; Jay Himanshu Mehta; Srishti Ginjala; Petter Brodin; Clive M. Gray; Yesha M. Patel; Lindsay G. Cowell; Amanda M. Burkhardt; Serghei Mangul (2023). Table1_Data Availability of Open T-Cell Receptor Repertoire Data, a Systematic Assessment.DOCX [Dataset]. http://doi.org/10.3389/fsysb.2022.918792.s001
Available download formats: docx
Unique identifier
https://doi.org/10.3389/fsysb.2022.918792.s001
Dataset updated
Jun 5, 2023
Dataset provided by
Frontiers Media (http://www.frontiersin.org/)
Authors
Yu-Ning Huang; Naresh Amrat Patel; Jay Himanshu Mehta; Srishti Ginjala; Petter Brodin; Clive M. Gray; Yesha M. Patel; Lindsay G. Cowell; Amanda M. Burkhardt; Serghei Mangul
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Modern data-driven research has the power to promote novel biomedical discoveries through secondary analyses of raw data. Therefore, it is important to ensure data-driven research with great reproducibility and robustness for promoting a precise and accurate secondary analysis of the immunogenomics data. In scientific research, rigorous conduct in designing and conducting experiments is needed, specifically in scientific writing and reporting results. It is also crucial to make raw data available, discoverable, and well described or annotated in order to promote future re-analysis of the data. In order to assess the data availability of published T cell receptor (TCR) repertoire data, we examined 11,918 TCR-Seq samples corresponding to 134 TCR-Seq studies ranging from 2006 to 2022. Among the 134 studies, only 38.1% had publicly available raw TCR-Seq data shared in public repositories. We also found a statistically significant association between the presence of data availability statements and the increase in raw data availability (p = 0.014). Yet, 46.8% of studies with data availability statements failed to share the raw TCR-Seq data. There is a pressing need for the biomedical community to increase awareness of the importance of promoting raw data availability in scientific research and take immediate action to improve its raw data availability enabling cost-effective secondary analysis of existing immunogenomics data by the larger scientific community.
DIAMAS survey on Institutional Publishing - aggregated data
nde-dev.biothings.io
data.niaid.nih.gov
+3 more
Updated Mar 13, 2025
Cite
Kramer, Bianca (2025). DIAMAS survey on Institutional Publishing - aggregated data [Dataset]. https://nde-dev.biothings.io/resources?id=zenodo_10590502
Dataset updated
Mar 13, 2025
Dataset provided by
Kramer, Bianca
Ross, George
License
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
The DIAMAS project investigates Institutional Publishing Service Providers (IPSP) in the broadest sense, with a special focus on those publishing initiatives that do not charge fees to authors or readers. To collect information on Institutional Publishing in the ERA, a survey was conducted among IPSPs between March and May 2024. This dataset contains aggregated data from the 685 valid responses to the DIAMAS survey on Institutional Publishing.

The dataset supplements D2.3 Final IPSP landscape Report Institutional Publishing in the ERA: results from the DIAMAS survey.

The data

Basic aggregate tabular data

Full individual survey responses are not being shared to prevent the easy identification of respondents (in line with conditions set out in the survey questionnaire). This dataset contains full tables with aggregate data for all questions from the survey, with the exception of free-text responses, from all 685 survey respondents. This includes, per question, overall totals and percentages for the answers given as well as the breakdown by both IPSP types: institutional publishers (IPs) and service providers (SPs). Tables at country level have not been shared, as cell values often turned out to be too low to prevent potential identification of respondents. The data is available in csv and docx formats, with csv files grouped and packaged into ZIP files. Metadata describing data type, question type, and question response rate is available in csv format. The R code used to generate the aggregate tables is made available as well.

Files included in this dataset

survey_questions_data_description.csv - metadata describing data type, question type, as well as question response rate per survey question.

tables_raw_all.zip - raw tables (csv format) with aggregated data per question for all respondents, with the exception of free-text responses. Questions with multiple answers have a table for each answer option. Zip file contains 180 csv files.

tables_raw_IP.zip - as tables_raw_all.zip, for responses from institutional publishers (IP) only. Zip file contains 180 csv files.

tables_raw_SP.zip - as tables_raw_all.zip, for responses from service providers (SP) only. Zip file contains 170 csv files.

tables_formatted_all.docx - formatted tables (docx format) with aggregated data per question for all respondents, with the exception of free-text responses. Questions with multiple answers have a table for each answer option.

tables_formatted_IP.docx - as tables_formatted_all.docx, for responses from institutional publishers (IP) only.

tables_formatted_SP.docx - as tables_formatted_all.docx, for responses from service providers (SP) only.

DIAMAS_Tables_single.R - R script used to generate raw tables with aggregated data for all single response questions

DIAMAS_Tables_multiple.R - R script used to generate raw tables with aggregated data for all multiple response questions

DIAMAS_Tables_layout.R - R script used to generate document with formatted tables from raw tables with aggregated data

DIAMAS Survey on Instititutional Publishing - data availability statement (pdf)

All data are made available under a CC0 license.
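Because the raw tables are packaged as CSV files inside ZIP archives, they can be read without unpacking. The following is a minimal sketch using only the Python standard library; read_tables is a hypothetical helper, and UTF-8 encoding of the CSV files is an assumption, not something stated in the description.

```python
import csv
import io
import zipfile


def read_tables(zip_source):
    """Map each CSV file name in the archive to its rows (lists of strings)."""
    tables = {}
    with zipfile.ZipFile(zip_source) as archive:
        for name in archive.namelist():
            if not name.lower().endswith(".csv"):
                continue
            with archive.open(name) as raw:
                # Assumption: the CSV files are UTF-8, possibly with a BOM.
                text = io.TextIOWrapper(raw, encoding="utf-8-sig", newline="")
                tables[name] = list(csv.reader(text))
    return tables
```

If the archive matches the file list above, len(read_tables("tables_raw_all.zip")) would be expected to equal 180.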
3D VPIC Data - Laboratory verification of electron-scale diffusion regions...
zenodo.org
bin
Updated Mar 4, 2021
Cite
Samuel Greess (2021). 3D VPIC Data - Laboratory verification of electron-scale diffusion regions modulated by a three-dimensional instability [Dataset]. http://doi.org/10.5281/zenodo.4556518
Available download formats: bin
Unique identifier
https://doi.org/10.5281/zenodo.4556518
Dataset updated
Mar 4, 2021
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Samuel Greess
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
3D Cylindrical VPIC data to partially satisfy the data availability requirement of JGR: Space Physics. See the full data availability statement for other data locations.

Each data file is in MATLAB .mat format and named for the toroidal (y) index to which the data files correspond. Included in each file is all the time-averaged simulation data needed to reconstruct 3D figures and the 3D Ohm's Law calculation. The input deck can be opened with a regular word processor (Notepad++ handles it well) and contains setup information about the simulation that can be used to recalculate the simulation size and parameters.
Scoring Criteria used for the assessment.
plos.figshare.com
xls
Updated Jul 23, 2025
Cite
Haya Deeb; Suzanna Creasey; Diego Lucini de Ugarte; George Strevens; Trisha Usman; Hwee Yun Wong; Megan A. M. Kutzer; Emma Wilson; Tomasz Zieliński; Andrew J. Millar (2025). Scoring Criteria used for the assessment. [Dataset]. http://doi.org/10.1371/journal.pone.0328065.t001
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pone.0328065.t001
Dataset updated
Jul 23, 2025
Dataset provided by
PLOS ONE
Authors
Haya Deeb; Suzanna Creasey; Diego Lucini de Ugarte; George Strevens; Trisha Usman; Hwee Yun Wong; Megan A. M. Kutzer; Emma Wilson; Tomasz Zieliński; Andrew J. Millar
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Open science promotes the accessibility of scientific research and data, emphasising transparency, reproducibility, and collaboration. This study assesses the Openness and FAIR (Findable, Accessible, Interoperable, and Reusable) aspects of data-sharing practices within the biosciences at the University of Edinburgh from 2014 to 2023. We analysed 555 research papers across biotechnology, regenerative medicine, infectious diseases, and non-communicable diseases. Our scoring system evaluated data completeness, reusability, accessibility, and licensing, finding a progressive shift towards better data-sharing practices. The fraction of publications that share all relevant data increased significantly, from 7% in 2014 to 45% in 2023. Data involving genomic sequences were shared more frequently than image data or data on human subjects or samples. The presence of a data availability statement (DAS) or preprint sharing correlated with more and better data sharing, particularly in terms of completeness. We discuss local and systemic factors underlying current and future Open data sharing. Evaluating the automated ODDPub (Open Data Detection in Publications) tool on this manually scored dataset demonstrated high specificity in identifying cases where no data was shared. ODDPub sensitivity improved with better documentation in the DAS. This positive trend highlights improvements in data sharing, advocating for continued advances and addressing challenges with data types and documentation.
Referenzierung von Forschungsdatenpublikationen in RADAR
radar-service.eu
tar
Updated Mar 21, 2025
Cite
Dorothea Strecker (2025). Referenzierung von Forschungsdatenpublikationen in RADAR [Dataset]. http://doi.org/10.22000/fbhfgzy8d43r3tjw
Available download formats: tar (78336 bytes)
Unique identifier
https://doi.org/10.22000/fbhfgzy8d43r3tjw
Dataset updated
Mar 21, 2025
Dataset provided by
Humboldt-Universität zu Berlin
Authors
Dorothea Strecker
Description
Description

This dataset describes how datasets published in the research data repository RADAR are referenced, combining references extracted from Google Scholar, DataCite Event Data and the Data Citation Corpus.

DOIs assigned to RADAR datasets were retrieved from the RADAR API on 2025-01-27. References in the three data sources were then identified using these DOIs. Each research output referencing a RADAR dataset was accessed to determine where the reference occurred in the full text. Author names and publication dates for datasets and referencing objects were added from OpenAlex and DataCite on 2025-02-10. Author names of datasets and referencing objects were compared to determine if data reuse occurred.

Columns

from: DOI of the referencing object

to: DOI of the RADAR dataset

from_date: publication date of the referencing object

to_date: publication date of the RADAR dataset

source_gs: boolean indicating if the reference was found in Google Scholar

source_dcc: boolean indicating if the reference was found in the Data Citation Corpus

source_ded: boolean indicating if the reference was found in DataCite Event Data

method_rl: boolean indicating if the dataset was referenced in the reference list

method_das: boolean indicating if the dataset was referenced in the data availability statement

method_fn: boolean indicating if the dataset was referenced in a footnote

method_ft: boolean indicating if the dataset was referenced in other parts of the full text, for example in the methods section

reuse_author: variable indicating whether the reference reflects data use (overlap in the author names of dataset and referencing object) or data reuse (no overlap)
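The four method_* columns above lend themselves to a simple tally of where in a publication datasets are referenced. The sketch below assumes the data is available as a CSV file with the column names listed above and that booleans are written as text such as TRUE or FALSE; both the file layout and the accepted spellings in TRUE_VALUES are assumptions, and the helper names are hypothetical.

```python
import csv

METHOD_COLUMNS = ("method_rl", "method_das", "method_fn", "method_ft")
TRUE_VALUES = {"true", "1", "yes", "t"}  # assumed text encodings of a true cell


def count_reference_locations(rows):
    """Count, per method_* column, how many references are flagged true."""
    counts = dict.fromkeys(METHOD_COLUMNS, 0)
    for row in rows:
        for column in METHOD_COLUMNS:
            if (row.get(column) or "").strip().lower() in TRUE_VALUES:
                counts[column] += 1
    return counts


def count_from_csv(path):
    """Apply the tally to a CSV file whose header contains the method_* columns."""
    with open(path, newline="", encoding="utf-8") as handle:
        return count_reference_locations(csv.DictReader(handle))
```

A reference can be flagged in more than one column, so the four counts need not add up to the number of rows.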
Global suicide mortality rates (2000-2019) and bibliographic data
data.niaid.nih.gov
zenodo.org
Updated Jun 22, 2024
Cite
Pranckeviciene, Erinija (2024). Global suicide mortality rates (2000-2019) and bibliographic data [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_12267301
Dataset updated
Jun 22, 2024
Dataset provided by
Vytautas Magnus University
Authors
Pranckeviciene, Erinija
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The dataset contains World Bank Suicide mortality rate WDI (world development indicator) (2000-2019) world-wide data in original and processed form. In addition to the statistical data, this dataset also contains bibliographic records of articles published on the topic of suicide in relation to individual countries during 2000-2019, in original and processed form.

The data consists of six archives:

World development indicator suicide mortality rate SH.STA.SUIC.P5. This archive contains suicide mortality rate of 159 countries during the period of 2000-2019 per 100,000 population including males and females as of November, 2023.

Web of science records country and suicide. This archive contains bibliographic records organized by country on the topic of suicide related to that country published during 2000-2019 as of November, 2023.

Suicide mortality rate statistics and keywords. This archive contains processed data from the first two archives in three files. The 'Countries suicide rates and WOS records' file contains temporal suicide mortality rate data for each country and each year for males and females, including counts of articles on suicide related to that country. The 'words and countries matrix' file records how many times author and paper keywords from suicide-related publications appeared in articles associated with each country. This data is organized as a matrix in which rows are keywords, columns are countries and cells are keyword counts. The 'words and countries pairs' file contains the same information organized as keyword-country pairs.

Suicide mortality rate clusters countries keywords titles. This archive contains bibliographic data organized by country clusters. These clusters group countries with similar suicide mortality rate dynamics in males and females, shown in two included figures. Each cluster folder contains a section with bibliographic records; a section with keywords associated with each country; and a section in which each publication associated with the country has a separate file containing its title and keywords.

Suicide keywords embedding data. This archive contains word embedding vectors and metadata learned by a recurrent neural network trained to classify countries from suicide-related keywords of articles associated with those countries. Folder 'trained with keywords' contains embeddings learned when the training samples are keyword strings of publications. Folder 'trained with titles' contains embeddings learned when the training samples are strings containing publication titles plus keywords.

Suicide keywords association rule mining. This archive contains files of subsets of keywords frequently mentioned together in suicide-related publications. Folder 'Mining in clusters' has frequent keyword itemsets in country clusters. Folder 'Mining in individual countries' has frequent keyword itemsets in countries. Examples of keyword networks connecting clusters and networks connecting countries in individual clusters are included, which help to identify keywords that are specific to or shared by country clusters and by countries within individual clusters.

These datasets support the data availability statements of upcoming articles.
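The relationship between the keyword-by-country matrix and the keyword-country pairs described in the third archive can be sketched as a simple reshaping step. This is an illustration only: the in-memory layout, the helper name matrix_to_pairs, and the choice to drop zero counts are assumptions, not a description of how the files were produced.

```python
def matrix_to_pairs(countries, rows):
    """Flatten a keyword-by-country count matrix into (keyword, country, count) triples.

    countries: column labels, in matrix order.
    rows: iterable of (keyword, counts) where counts aligns with countries.
    """
    pairs = []
    for keyword, counts in rows:
        for country, count in zip(countries, counts):
            if count:  # assumption: pairs with a zero count are left out
                pairs.append((keyword, country, count))
    return pairs
```

For instance, a matrix with columns ["A", "B"] and the row ("risk", [2, 0]) yields the single triple ("risk", "A", 2).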
Notebooks and Boolean networks for reproducing binarisation case study and...
datasetcatalog.nlm.nih.gov
Updated Jul 8, 2024
Cite
Zinovyev, Andrei; Calzone, Laurence; Magaña-López, Gustavo; Paulevé, Loïc (2024). Notebooks and Boolean networks for reproducing binarisation case study and synthetic data generation. [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001289525
Dataset updated
Jul 8, 2024
Authors
Zinovyev, Andrei; Calzone, Laurence; Magaña-López, Gustavo; Paulevé, Loïc
Description
The notebooks are provided as static HTML files, and Boolean networks as textual files in BoolNet format. See the Data availability statement for links to executable notebooks and code. (ZIP)
Open Science Indicators for a corpus of 8,131 research articles published by...
figshare.com
xlsx
Updated Oct 22, 2025
Cite
Rebecca Taylor-Grant; Eilise Norris (2025). Open Science Indicators for a corpus of 8,131 research articles published by Taylor & Francis journals [Dataset]. http://doi.org/10.6084/m9.figshare.30316342.v1
Available download formats: xlsx
Unique identifier
https://doi.org/10.6084/m9.figshare.30316342.v1
Dataset updated
Oct 22, 2025
Dataset provided by
Figshare (http://figshare.com/)
Authors
Rebecca Taylor-Grant; Eilise Norris
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset represents a set of Open Science Indicators generated by the AI solution provider DataSeer using a corpus of 8,131 research articles published in Taylor & Francis journals in 2023.

The corpus was selected using purposive sampling of journal titles to ensure inclusion of Open Access and Open Select (hybrid) journals; journals with a variety of data sharing policies; and journals representing a range of disciplines including life sciences, medicine and health, earth sciences, social sciences and psychology. From each journal in the corpus a random representative sample of between approximately 10-50 articles was selected and the full 2023 published output of any single journal is not included.

The DataSeer analysis identified Open Science indicators including:

- Presence of data availability statements;
- Evidence of data sharing (via supplementary files or data repositories);
- Evidence of code sharing;
- Evidence of preprinting;
- Pre-registration of studies;
- Use of persistent identifiers (ORCIDs and RRIDs).

As the dataset was generated by an AI tool, some errors or inaccuracies may be present. Before sharing the dataset publicly, the project team at Taylor & Francis undertook data cleansing to ensure that the dataset is comprehensible to an external audience and to enhance its reusability. Notes on data cleansing are included in the README file in the dataset spreadsheet, along with explanations of column headers where needed.
This is the Data Availability Statement.
figshare.com
xlsx
Updated Jun 3, 2025
Cite
Chilot Kassa Mekonnen; Hailemichael Kindie Abate; Abere Woretaw Azagew; Muluken Chanie Agimas (2025). This is the Data Availability Statement. [Dataset]. http://doi.org/10.1371/journal.pone.0324363.s002
Available download formats: xlsx
Unique identifier
https://doi.org/10.1371/journal.pone.0324363.s002
Dataset updated
Jun 3, 2025
Dataset provided by
PLOS ONE
Authors
Chilot Kassa Mekonnen; Hailemichael Kindie Abate; Abere Woretaw Azagew; Muluken Chanie Agimas
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Introduction

Epilepsy is a common non-communicable neurological disorder associated with recurrent seeding of cerebral neurons or brain cells and episodes of unprovoked seizures with or without loss of consciousness. Although there are studies on the health-related quality of life of epilepsy patients in Ethiopia, there are remarkable variations in the estimates of health-related quality of life.

Objectives

This systematic review and meta-analysis aimed to determine the pooled effect size of the health-related quality of life of adult epilepsy patients in Ethiopia.

Methods

Original articles about the health-related quality of life among epilepsy patients in Ethiopia were searched through known and international databases (PubMed, Scopus, and Web of Science) and search engines (Google and Google Scholar). Data were extracted using a standard data extraction checklist developed according to Joanna Briggs Institute (JBI). The I2 statistics were used to identify heterogeneity across studies. Funnel plot asymmetry and Egger’s tests were used to check for publication bias. The STATA version 11 software was employed for statistical analysis to pool the mean scores of health-related quality-of-life.

Result

A total of 16 cross-sectional studies with a sample size of 5294 took part. The pooled overall mean score of health-related quality of life among epilepsy patients in Ethiopia was 52.82 ± 13.24 [95%CI (46.41, 59.21)], I2 = 100%, p-value
Journals code.
plos.figshare.com
xlsx
Updated Sep 2, 2025
Cite
Pinge Zhao; Xin Zhang; Liandi Dai; Baoguo Ma; Yuting Duan; Yan Xu; Hongmei Wei; Shengwei Wu; Linghui Xiong (2025). Journals code. [Dataset]. http://doi.org/10.1371/journal.pone.0331697.s001
Available download formats: xlsx
Unique identifier
https://doi.org/10.1371/journal.pone.0331697.s001
Dataset updated
Sep 2, 2025
Dataset provided by
PLOS (http://plos.org/)
Authors
Pinge Zhao; Xin Zhang; Liandi Dai; Baoguo Ma; Yuting Duan; Yan Xu; Hongmei Wei; Shengwei Wu; Linghui Xiong
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Responsible data sharing in clinical research can enhance the transparency and reproducibility of research evidence, thereby increasing the overall value of research. Since 2024, more than 5,000 journals have adhered to the International Committee of Medical Journal Editors (ICMJE) Data Sharing Statement (DSS) to promote data sharing. However, due to the significant effort required for data sharing and the scarcity of academic rewards, data availability in clinical research remains suboptimal. This study aims to explore the impact of biomedical journal policies and available supporting information on the implementation of data availability in clinical research publications. This cross-sectional study will select 303 journals and their latest publications as samples from the biomedical journals listed in the Web of Science Journal Citation Reports, using stratified random sampling according to the 2023 Journal Impact Factor (JIF). Two researchers will independently extract journal data-sharing policies from the submission guidelines of eligible journals and data-sharing details from publications using a pre-designed form from Apr 2025 to Dec 2025. The data-sharing levels of publications will be based on the openness of the data-sharing mechanism. Binomial logistic regression analyses will be used to identify potential journal factors that affect publication data-sharing levels. This protocol has been registered in Open Science Framework (OSF) Registries: https://doi.org/10.17605/OSF.IO/EX6DV.
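Stratified random sampling of the kind mentioned in this protocol can be illustrated with a short sketch. How the strata are defined (for example JIF quartiles) and how the 303 journals are allocated across them is not stated here, so the proportional largest-remainder allocation below is an assumption chosen for illustration, and stratified_sample is a hypothetical helper.

```python
import random


def stratified_sample(strata, total, seed=0):
    """Draw `total` items across strata, proportionally to stratum size.

    strata: dict mapping a stratum label to a list of items.
    """
    population = sum(len(items) for items in strata.values())
    if total > population:
        raise ValueError("sample size exceeds population")
    # Largest-remainder allocation so the per-stratum quotas add up to `total`.
    exact = {name: total * len(items) / population for name, items in strata.items()}
    quota = {name: int(share) for name, share in exact.items()}
    leftover = total - sum(quota.values())
    by_remainder = sorted(exact, key=lambda name: exact[name] - quota[name], reverse=True)
    for name in by_remainder[:leftover]:
        quota[name] += 1
    rng = random.Random(seed)  # fixed seed keeps the draw reproducible
    return {name: rng.sample(list(items), quota[name]) for name, items in strata.items()}
```

With two strata of 100 and 200 journals and total=30, the sketch draws 10 and 20 journals respectively.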

