A data availability statement (DAS) is the part of a research manuscript that states where the raw data from the study can be accessed. Many journals do not require a DAS, and in those journals most authors do not include one [1]. In journals that do require a DAS, most authors state that their data is available on request from the corresponding author. However, it has been reported that the overwhelming majority of those corresponding authors do not even respond to a data request, and very few share their data [2].
Apart from a genuine unwillingness to share data, potential reasons for not answering such messages are that the request ended up in a spam folder, that the authors are too busy, or that other team member(s) hold the raw data.
The aim of this study is to assess whether more raw data can be accessed if the data sharing request is sent to all authors versus only requesting data from the corresponding author.
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This dataset underpins research undertaken by the Data Publishing team at Springer Nature which analysed the impact of Data Availability Statements on Nature journal editors, and how researchers choose to share their data. Mandatory Data Availability Statements, which require researchers to state how their data can be accessed, were introduced by Nature journals in 2016. The dataset comprises a single Excel file, which includes the journal title, unique ID for each published article, subject areas, and the estimated time required to include a Data Availability Statement as reported by the journals' editorial staff. The median time per journal is also calculated. The full text of the Data Availability Statement is included, and the statements are coded according to the data sharing method described. This dataset supports a paper that has been peer reviewed and accepted for presentation at the International Digital Curation Conference 2018. The paper has been submitted to the International Journal of Digital Curation. At the time of dataset release the full paper is available as a preprint in BioRxiv.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
Data and code associated with "The Observed Availability of Data and Code in Earth Science
and Artificial Intelligence" by Erin A. Jones, Brandon McClung, Hadi Fawad, and Amy McGovern.
Instructions: To reproduce figures, download all associated Python and CSV files and place
in a single directory.
Run BAMS_plot.py as you would run Python code on your system.
Code:
BAMS_plot.py: Python code for categorizing data availability statements based on given data
documented below and creating figures 1-3.
Code was originally developed for Python 3.11.7 and run in the Spyder
(version 5.4.3) IDE.
Libraries utilized:
numpy (version 1.26.4)
pandas (version 2.1.4)
matplotlib (version 3.8.0)
For additional documentation, please see code file.
Data:
ASDC_AIES.csv: CSV file containing relevant availability statement data for Artificial
Intelligence for the Earth Systems (AIES)
ASDC_AI_in_Geo.csv: CSV file containing relevant availability statement data for Artificial
Intelligence in Geosciences (AI in Geo.)
ASDC_AIJ.csv: CSV file containing relevant availability statement data for Artificial
Intelligence (AIJ)
ASDC_MWR.csv: CSV file containing relevant availability statement data for Monthly
Weather Review (MWR)
Data documentation:
All CSV files contain the same format of information for each journal. The CSV files above are
needed for the BAMS_plot.py code attached.
Records were analyzed based on the criteria below.
Records:
1) Title of paper
The title of the examined journal article.
2) Article DOI (or URL)
A link to the examined journal article. For AIES, AI in Geo., MWR, the DOI is
generally given. For AIJ, the URL is given.
3) Journal name
The name of the journal where the examined article is published. Either a full
journal name (e.g., Monthly Weather Review), or the acronym used in the
associated paper (e.g., AIES) is used.
4) Year of publication
The year the article was posted online/in print.
5) Is there an ASDC?
If the article contains an availability statement in any form, "yes" is
recorded. Otherwise, "no" is recorded.
6) Justification for non-open data?
If an availability statement contains some justification for why data is not
openly available, the justification is summarized and recorded as one of the
following options: 1) Dataset too large, 2) Licensing/Proprietary, 3) Can be
obtained from other entities, 4) Sensitive information, 5) Available at later
date. If the statement indicates any data is not openly available and no
justification is provided, or if no statement is provided, "None"
is recorded. If the statement indicates openly available data or no data
produced, "N/A" is recorded.
7) All data available
If there is an availability statement and data is produced, "y" is recorded
if means to access data associated with the article are given and there is no
indication that any data is not openly available; "n" is recorded if no means
to access data are given or there is some indication that some or all data is
not openly available. If there is no availability statement or no data is
produced, the record is left blank.
8) At least some data available
If there is an availability statement and data is produced, "y" is recorded
if any means to access data associated with the article are given; "n" is
recorded if no means to access data are given. If there is no availability
statement or no data is produced, the record is left blank.
9) All code available
If there is an availability statement and data is produced, "y" is recorded
if means to access code associated with the article are given and there is no
indication that any code is not openly available; "n" is recorded if no means
to access code are given or there is some indication that some or all code is
not openly available. If there is no availability statement or no data is
produced, the record is left blank.
10) At least some code available
If there is an availability statement and data is produced, "y" is recorded
if any means to access code associated with the article are given; "n" is
recorded if no means to access code are given. If there is no availability
statement or no data is produced, the record is left blank.
11) All data available upon request
If there is an availability statement indicating data is produced and no data
is openly available, "y" is recorded if any data is available upon request to
the authors of the examined journal article (not a request to any other
entity); "n" is recorded if no data is available upon request to the authors
of the examined journal article. If there is no availability statement, any
data is openly available, or no data is produced, the record is left blank.
12) At least some data available upon request
If there is an availability statement indicating data is produced and not all
data is openly available, "y" is recorded if all data is available upon
request to the authors of the examined journal article (not a request to any
other entity); "n" is recorded if not all data is available upon request to
the authors of the examined journal article. If there is no availability
statement, all data is openly available, or no data is produced, the record
is left blank.
13) no data produced
If there is an availability statement that indicates that no data was
produced for the examined journal article, "y" is recorded. Otherwise, the
record is left blank.
14) links work
If the availability statement contains one or more links to a data or code
repository, "y" is recorded if all links work; "n" is recorded if one or more
links do not work. If there is no availability statement or the statement
does not contain any links to a data or code repository, the record is left
blank.
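The record scheme above can be tallied with a few lines of Python. The following is a minimal sketch, not the logic of BAMS_plot.py: the column headers, the category names, and the inline sample rows are all assumptions made for illustration, and the real headers in the ASDC_*.csv files may differ.

```python
import csv
import io

# Hypothetical column headers; check the actual ASDC_*.csv files before use.
COL_HAS_STATEMENT = "Is there an ASDC?"
COL_ALL_DATA = "All data available"
COL_SOME_DATA = "At least some data available"
COL_NO_DATA = "no data produced"


def _flag(row, column):
    """Return the lower-cased cell value, treating missing cells as blank."""
    return (row.get(column) or "").strip().lower()


def summarize(rows):
    """Count records per (illustrative) data availability category."""
    counts = {
        "no_statement": 0,
        "no_data_produced": 0,
        "all_data": 0,
        "some_data": 0,
        "no_data_available": 0,
    }
    for row in rows:
        if _flag(row, COL_HAS_STATEMENT) != "yes":
            counts["no_statement"] += 1
        elif _flag(row, COL_NO_DATA) == "y":
            counts["no_data_produced"] += 1
        elif _flag(row, COL_ALL_DATA) == "y":
            counts["all_data"] += 1
        elif _flag(row, COL_SOME_DATA) == "y":
            counts["some_data"] += 1
        else:
            counts["no_data_available"] += 1
    return counts


# Made-up sample rows standing in for one journal's CSV file.
SAMPLE = """Title of paper,Is there an ASDC?,All data available,At least some data available,no data produced
Paper A,yes,y,y,
Paper B,yes,n,y,
Paper C,no,,,
Paper D,yes,,,y
Paper E,yes,n,n,
"""

if __name__ == "__main__":
    print(summarize(csv.DictReader(io.StringIO(SAMPLE))))
```

To apply the same tally to a real file, pass csv.DictReader(open("ASDC_MWR.csv", newline="")) to summarize() after adjusting the column constants to match the file's headers.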
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
Raw data supporting the Springer Nature Data Availability Statement (DAS) analysis in the State of Open Data 2024.
SOOD_2024_special_analysis_DAS_SN.xlsx contains the DAS, DOI, publication date, DAS categories and related country by institution of any author.
SOOD 2024_DAS_analysis_sharing.xlsx contains the summary data by country and data sharing type.
Utilizing the Dimensions database, we identified articles containing key DAS identifiers such as “Data Availability Statement” or “Availability of Data and Materials” within their full text. Digital Object Identifiers (DOIs) of these articles were collected and matched against Springer Nature’s XML database to extract the DAS for each article. The extracted DAS were categorized into specific sharing types using text and data matching terms. For statements indicating that data are publicly available in a repository, we matched against a predefined list of repository identifiers, names, and URLs. The DAS were classified into the following categories:
1. Data are available from the author on request.
2. Data are included in the manuscript or its supplementary material.
3. Some or all of the data are publicly available, for example in a repository.
4. Figure source data are included with the manuscript.
5. Data availability is not applicable.
6. Data are declared as not available by the author.
7. Data available online but not in a repository.
These categories are non-exclusive: more than one can apply to any one article. Publications outside the 2019–2023 range and non-article publication types (e.g., book chapters) that were initially included in the Dimensions search results were excluded from the final dataset. Articles were included in the final analysis after applying the exclusion criteria. Upon processing, it was found that only 370 results were returned for Botswana across the five-year period; due to this low number, Botswana was not included in the DAS focused country-level analysis.
This analysis does not assess the accuracy of the DAS in the context of each individual article. There was no manual verification of the categories applied; as a result, terms used out of context could have led to misclassification. Approximately 5% of articles remained unclassified following text and data matching due to these limitations.
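The non-exclusive, term-based categorization described above can be illustrated with a short Python sketch. The patterns, category names, and repository names below are assumptions chosen for illustration; the actual matching terms and the predefined repository list used in the analysis are not given in this description, and the seventh category (data online but not in a repository) is omitted here.

```python
import re

# Hypothetical patterns; the real matching terms are not published in this description.
CATEGORY_PATTERNS = {
    "on_request": [r"\b(?:up)?on (?:reasonable )?request\b"],
    "in_manuscript_or_supplement": [
        r"\bsupplementary\b",
        r"\bwithin the (?:article|manuscript|paper)\b",
    ],
    "public_repository": [r"\bzenodo\b", r"\bfigshare\b", r"\bdryad\b"],
    "source_data": [r"\bsource data\b"],
    "not_applicable": [r"\bnot applicable\b"],
    "not_available": [r"\bnot (?:publicly )?available\b"],
}


def classify_das(text):
    """Return every category whose patterns match; an empty list means unclassified."""
    lowered = text.lower()
    return sorted(
        name
        for name, patterns in CATEGORY_PATTERNS.items()
        if any(re.search(pattern, lowered) for pattern in patterns)
    )


if __name__ == "__main__":
    print(classify_das("Data are available from the corresponding author on reasonable request."))
    print(classify_das("Data are deposited in Zenodo and in the supplementary information files."))
```

Because matching is purely textual, a statement such as "data are not publicly available but can be obtained on request" falls into two categories at once, which mirrors both the non-exclusive design and the misclassification risk noted above.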
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
Background: There is increasing interest to make primary data from published research publicly available. We aimed to assess the current status of making research data available in highly-cited journals across the scientific literature.
Methods and Results: We reviewed the first 10 original research papers of 2009 published in the 50 original research journals with the highest impact factor. For each journal we documented the policies related to public availability and sharing of data. Of the 50 journals, 44 (88%) had a statement in their instructions to authors related to public availability and sharing of data. However, there was wide variation in journal requirements, ranging from requiring the sharing of all primary data related to the research to just including a statement in the published manuscript that data can be available on request. Of the 500 assessed papers, 149 (30%) were not subject to any data availability policy. Of the remaining 351 papers that were covered by some data availability policy, 208 papers (59%) did not fully adhere to the data availability instructions of the journals they were published in, most commonly (73%) by not publicly depositing microarray data. The other 143 papers that adhered to the data availability instructions did so by publicly depositing only the specific data type as required, making a statement of willingness to share, or actually sharing all the primary data. Overall, only 47 papers (9%) deposited full primary raw data online. None of the 149 papers not subject to data availability policies made their full primary data publicly available.
Conclusion: A substantial proportion of original research papers published in high-impact journals are either not subject to any data availability policies, or do not adhere to the data availability instructions in their respective journals. This empiric evaluation highlights opportunities for improvement.
Please contact the corresponding author, Shynar Dyussembayeva, for more information on the dataset.
https://spdx.org/licenses/CC0-1.0.html
The data underlying scientific papers should be accessible to researchers both now and in the future, but how best can we ensure that these data are available? Here we examine the effectiveness of four approaches to data archiving: no stated archiving policy, recommending (but not requiring) archiving, and two versions of mandating data deposition at acceptance. We control for differences between data types by trying to obtain data from papers that use a single, widespread population genetic analysis, STRUCTURE. At one extreme, we found that mandated data archiving policies that require the inclusion of a data availability statement in the manuscript improve the odds of finding the data online almost 1000-fold compared to having no policy. However, archiving rates at journals with less stringent policies were only very slightly higher than those with no policy at all. We also assessed the effectiveness of asking for data directly from authors and obtained over half of the requested datasets, albeit with ∼8 d delay and some disagreement with authors. Given the long-term benefits of data accessibility to the academic community, we believe that journal-based mandatory data archiving policies and mandatory data availability statements should be more widely adopted.
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) https://creativecommons.org/licenses/by-nc-sa/4.0/
Working with Adult-Child Survivors of Severe Parental Alienation Abuse: Survivors and Mental Health Practitioners Perspectives. A qualitative study.
By Alyse Price-Tobler
The PhD thesis from USC Queensland is currently under embargo for one year, as directed by the Dean of Graduate Studies. However, the data notes associated with the thesis are accessible in the UniSC Research Bank and F1000Research, an open-access publishing platform affiliated with the Taylor & Francis Group.
Regarding journal requests concerning the availability of data sets from the embargoed thesis for publication purposes, the following details are provided:
Abstract-
These two studies present new insights into the perspectives of adult-child survivors of severe parental alienation (SPA) and their mental health practitioners (MHPs) in addressing the complex challenges of SPA support and treatment. The term adult-child survivor refers to a child who has grown up and been exposed to SPA by one of their parents. Parental alienation (PA) refers to a process in which one parent (referred to as the alienating parent, or AP) takes actions to negatively impact the relationship between a child and their other parent (known as the targeted parent, or TP) (Haines et al., 2020, p. 3). This concept is defined as the alienating parent's behaviours influencing the child to reject the TP without a reasonable explanation (Haines et al., 2020).
The issue of childhood exposure to severe levels of parental alienating behaviours (PABs) is a prevalent and serious problem that can have long-lasting adverse effects on adult survivors. However, MHPs are often ill-equipped to work with adult survivors due to limited professional development and the lack of established best-practice treatment protocols for practitioners to reference. Therefore, this research has investigated the perspectives of survivors and of the MHPs who work with them regarding therapeutic practice. In particular, these two studies focus on identifying efficacious and counterproductive mental health practices.
These two studies utilised a qualitative research design with a social constructionist thematic analysis approach. Semi-structured interviews were conducted to collect data from eleven adult survivors of SPA and ten MHPs who were self-acknowledged as experts in treating adult survivors of severe parental alienation.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
Data repository for the manuscript: Kuppe, Ibrahim et al. "Decoding myofibroblast origins in human kidney fibrosis", 2020. Please also consult the supplemental data in the paper, and the data availability statement in the manuscript for raw FASTQ files for mouse data.
For further data requests and questions, please contact Dr. Rafael Kramann (rkramann@ukaachen.de)
File Details:
- Human in vitro PDGFRb+ RNA-seq (bulk RNA-seq data for various NKD2 knock-out and knock-in clones)
* invitro_bulk_rnaseq.tar.gz: Salmon output for all samples. Please see the manuscript for further information.
- UUO Mouse FACS sorted PDGFRa+/b+ ATAC-Seq
* mouse_uuo_pdgfrab_atacseq.bw: BigWig Signal file for ATAC-Seq data, PDGFRa+/b+ FACS sorted cells from day 10 UUO mouse kidneys (average of two biological replicates)
* mouse_uuo_pdgfrab_motifs.meme: Motifs identified based on the ATAC-Seq data and further analyzed in the paper
- UUO and Sham Mouse FACS sorted PDGFRa+/b+ scRNA-seq (10x Genomics)
* Mouse_PDGFRab.tar.gz: contains the count data derived by Alevin/Salmon for the cells analyzed in the paper in matrix market format (.mtx). column data include cell cluster annotations.
- UUO and Sham Mouse FACS sorted PDGFRb+ scRNA-seq (SmartSeq2)
* Mouse_PDGFRa.tar.gz: contains the expression data for the cells analyzed in the paper in matrix market format (.mtx). column data include cell cluster annotations.
- Human FACS sorted CD10+ scRNA-seq (10x Genomics)
* Human_CD10plus.tar.gz: contains the count data derived by Alevin/Salmon for the cells analyzed in the paper in matrix market format (.mtx). column data include cell cluster annotations.
- Human FACS sorted CD10- scRNA-seq (10x Genomics)
* Human_CD10minus.tar.gz: contains the count data derived by Alevin/Salmon for the cells analyzed in the paper in matrix market format (.mtx). column data include cell cluster annotations.
- Human FACS sorted PDGFRb+ scRNA-seq (10x Genomics)
* Human_PDGFRb.tar.gz: contains the count data derived by Alevin/Salmon for the cells analyzed in the paper in matrix market format (.mtx). column data include cell cluster annotations.
* HumanPDGFRBpositive_Nkd2_grnboost2.csv: Gene Regulatory Network obtained by GRNboost2 on genes correlated with NKD2 in Fibroblast (Mesenchymal) cells. See manuscript for details.
* Human_PDGFRBplus_TFanalysis.tar.gz: TF analysis based on single cell RNA-seq for promoter and distal regions. See manuscript for details.
- github_files.tar.gz: RData Objects associated with the paper code repository (https://github.com/mahmoudibrahim/KidneyMap)
Attribution-NonCommercial-ShareAlike 2.5 (CC BY-NC-SA 2.5) https://creativecommons.org/licenses/by-nc-sa/2.5/
The resource consists of a series of experimental data obtained in the context of the research work recently published in ICF-RSC, and serves as the public consultation support requested by the publisher in its Data Availability Statement (DAS).