100+ datasets found

Data from: Data reuse and the open data citation advantage
data.niaid.nih.gov
search.dataone.org
+2 more
zip
Updated Oct 1, 2013
Cite
Heather A. Piwowar; Todd J. Vision (2013). Data reuse and the open data citation advantage [Dataset]. http://doi.org/10.5061/dryad.781pv
Explore at:
Available download formats: zip
Unique identifier
https://doi.org/10.5061/dryad.781pv
Dataset updated
Oct 1, 2013
Dataset provided by
National Evolutionary Synthesis Center
Authors
Heather A. Piwowar; Todd J. Vision
License
https://spdx.org/licenses/CC0-1.0.html
Description
Background: Attribution to the original contributor upon reuse of published data is important both as a reward for data creators and to document the provenance of research findings. Previous studies have found that papers with publicly available datasets receive a higher number of citations than similar studies without available data. However, few previous analyses have had the statistical power to control for the many variables known to predict citation rate, which has led to uncertain estimates of the "citation benefit". Furthermore, little is known about patterns in data reuse over time and across datasets. Method and Results: Here, we look at citation rates while controlling for many known citation predictors, and investigate the variability of data reuse. In a multivariate regression on 10,555 studies that created gene expression microarray data, we found that studies that made data available in a public repository received 9% (95% confidence interval: 5% to 13%) more citations than similar studies for which the data was not made available. Date of publication, journal impact factor, open access status, number of authors, first and last author publication history, corresponding author country, institution citation history, and study topic were included as covariates. The citation benefit varied with date of dataset deposition: a citation benefit was most clear for papers published in 2004 and 2005, at about 30%. Authors published most papers using their own datasets within two years of their first publication on the dataset, whereas data reuse papers published by third-party investigators continued to accumulate for at least six years. To study patterns of data reuse directly, we compiled 9,724 instances of third party data reuse via mention of GEO or ArrayExpress accession numbers in the full text of papers. 
The level of third-party data use was high: for 100 datasets deposited in year 0, we estimated that 40 papers in PubMed reused a dataset by year 2, 100 by year 4, and more than 150 data reuse papers had been published by year 5. Data reuse was distributed across a broad base of datasets: a very conservative estimate found that 20% of the datasets deposited between 2003 and 2007 had been reused at least once by third parties. Conclusion: After accounting for other factors affecting citation rate, we find a robust citation benefit from open data, although a smaller one than previously reported. We conclude there is a direct effect of third-party data reuse that persists for years beyond the time when researchers have published most of the papers reusing their own data. Other factors that may also contribute to the citation benefit are considered. We further conclude that, at least for gene expression microarray data, a substantial fraction of archived datasets are reused, and that the intensity of dataset reuse has been steadily increasing since 2003.
OpenCitations Index N-Triples dataset of all the citation data
figshare.com
zip
Updated Mar 27, 2025
Cite
OpenCitations (2025). OpenCitations Index N-Triples dataset of all the citation data [Dataset]. http://doi.org/10.6084/m9.figshare.24369136.v4
Explore at:
Available download formats: zip
Unique identifier
https://doi.org/10.6084/m9.figshare.24369136.v4
Dataset updated
Mar 27, 2025
Dataset provided by
Figshare (http://figshare.com/)
Authors
OpenCitations
License
CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
License information was derived automatically
Description
This dataset contains all the citation data (in N-Triples format) included in the OpenCitations Index, released on March 24, 2025. In particular, any citation in the dataset, defined as an individual of the class cito:Citation, includes the following information:

[citation IRI] the Open Citation Identifier (OCI) for the citation, defined in the final part of the URL identifying the citation (https://w3id.org/oc/index/ci/[OCI]);
[property "cito:hasCitingEntity"] the citing entity identified by its OMID URL (https://opencitations.net/meta/[OMID]);
[property "cito:hasCitedEntity"] the cited entity identified by its OMID URL (https://opencitations.net/meta/[OMID]);
[property "cito:hasCitationCreationDate"] the creation date of the citation (i.e. the publication date of the citing entity);
[property "cito:hasCitationTimeSpan"] the time span of the citation (i.e. the interval between the publication date of the cited entity and the publication date of the citing entity);
[type "cito:JournalSelfCitation"] it records whether the citation is a journal self-citation (i.e. the citing and the cited entities are published in the same journal);
[type "cito:AuthorSelfCitation"] it records whether the citation is an author self-citation (i.e. the citing and the cited entities have at least one author in common).

Note: the information for each citation is sourced from OpenCitations Meta (https://opencitations.net/meta), a database that stores and delivers bibliographic metadata for all bibliographic resources included in the OpenCitations Indexes. The data provided in this dump is therefore based on the state of OpenCitations Meta at the time this collection was generated.

This version of the dataset contains 2,155,497,918 citations. The size of the zipped archive is 80.6 GB, while the size of the unzipped N-Triples files is 1.9 TB.
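As a rough illustration of how the citation structure described above might be consumed, the sketch below pairs citing and cited entities from N-Triples lines. It assumes that each statement sits on a single line, that the `cito:` prefix expands to the CiTO namespace `http://purl.org/spar/cito/`, and that both properties carry IRI objects. The function name `citation_edges` and the identifiers in `sample` are made up for the example; this is not OpenCitations' own tooling.

```python
import re

# Assumption: the cito: prefix expands to the CiTO ontology namespace.
CITO = "http://purl.org/spar/cito/"

# Matches only triples whose subject, predicate and object are all IRIs.
# Literal-valued triples (creation dates, time spans) are deliberately skipped.
IRI_TRIPLE = re.compile(r"^<([^>]+)>\s+<([^>]+)>\s+<([^>]+)>\s*\.$")


def citation_edges(lines):
    """Return (citing, cited) IRI pairs for citations seen with both ends."""
    citing, cited = {}, {}
    for line in lines:
        match = IRI_TRIPLE.match(line.strip())
        if match is None:
            continue
        subject, predicate, obj = match.groups()
        if predicate == CITO + "hasCitingEntity":
            citing[subject] = obj
        elif predicate == CITO + "hasCitedEntity":
            cited[subject] = obj
    return [(citing[c], cited[c]) for c in citing if c in cited]


# Hypothetical identifiers, for illustration only.
sample = [
    "<https://w3id.org/oc/index/ci/EXAMPLE> <http://purl.org/spar/cito/hasCitingEntity> <https://opencitations.net/meta/br/A> .",
    "<https://w3id.org/oc/index/ci/EXAMPLE> <http://purl.org/spar/cito/hasCitedEntity> <https://opencitations.net/meta/br/B> .",
]
print(citation_edges(sample))
```

Holding both dictionaries in memory would not scale to the full dump of roughly 2.16 billion citations; for the complete archive a streaming or database-backed approach would likely be needed.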
Data from: National citation patterns of NEJM, The Lancet, JAMA and The BMJ...
data.niaid.nih.gov
zenodo.org
+1 more
zip
Updated Sep 25, 2017
Cite
Gonzalo Casino; Roser Rius; Erik Cobo (2017). National citation patterns of NEJM, The Lancet, JAMA and The BMJ in the lay press: a quantitative content analysis [Dataset]. http://doi.org/10.5061/dryad.bh576
Explore at:
Available download formats: zip
Unique identifier
https://doi.org/10.5061/dryad.bh576
Dataset updated
Sep 25, 2017
Dataset provided by
Department of Communications and the Arts
Department of Statistics and Operations Research
Authors
Gonzalo Casino; Roser Rius; Erik Cobo
License
https://spdx.org/licenses/CC0-1.0.html
Description
Objectives: To analyse the total number of newspaper articles citing the four leading general medical journals and to describe national citation patterns. Design: Quantitative content analysis Setting/sample: Full text of 22 general newspapers in 14 countries over the period 2008-2015, collected from LexisNexis. The 14 countries have been categorized into four regions: US, UK, Western World (EU countries other than UK, and Australia, New Zealand and Canada) and Rest of the World (other countries). Main outcome measure: Press citations of four medical journals (two American: NEJM and JAMA; and two British: The Lancet and The BMJ) in 22 newspapers. Results: British and American newspapers cited some of the four analysed medical journals about three times a week in 2008-2015 (weekly mean 3.2 and 2.7 citations respectively); the newspapers from other Western countries did so about once a week (weekly mean 1.1), and those from the Rest of the World cited them about once a month (monthly mean 1.1). The New York Times cited above all other newspapers (weekly mean 4.7). The analysis showed the existence of three national citation patterns in the daily press: American newspapers cited mostly American journals (70.0% of citations), British newspapers cited mostly British journals (86.5%), and the rest of the analysed press cited more British journals than American ones. The Lancet was the most cited journal in the press of almost all Western countries outside the US and the UK. Multivariate correspondence analysis confirmed the national patterns and showed that over 85% of the citation data variability is retained in just one single new variable: the national dimension. Conclusion: British and American newspapers are the ones that cite the four analysed medical journals more often, showing a domestic preference for their respective national journals; non-British and non-American newspapers show a common international citation pattern.
The major statistical data of natural referencing
data.europa.eu
html
Updated Mar 16, 2024
Cite
Jack Hone (2024). The major statistical data of natural referencing [Dataset]. https://data.europa.eu/data/datasets/65f594ba5cf5f141524928b6
Explore at:
Available download formats: html
Dataset updated
Mar 16, 2024
Dataset authored and provided by
Jack Hone
Description
This dataset gathers the most crucial SEO statistics for the year, providing an overview of the dominant trends and best practices in the field of search engine optimization. Aimed at digital marketing professionals, site owners, and SEO analysts, this collection of information serves as a guide to navigate the evolving SEO landscape with confidence and accuracy.

Mode of Data Production:

The statistics have been carefully selected and compiled from a variety of credible and recognized sources in the SEO industry, including research reports, web traffic data analytics, and consumer and marketing professional surveys. Each statistic was checked for reliability and relevance to current trends.

Categories Included:

User search behaviour: Statistics on the evolution of search modes, including voice and mobile search.
Mobile Optimisation: Data on the importance of site optimization for mobile devices.
Importance of Backlinks: Insights on the role of backlinks in SEO ranking and the need to prioritize quality.
Content quality: Statistics highlighting the importance of relevant and engaging content for SEO.
Search engine algorithms: Information on the impact of algorithm updates on SEO strategies.

Usefulness of the Data: This dataset is designed to help users quickly understand current SEO dynamics and apply that knowledge in optimizing their digital marketing strategies. It provides a solid foundation for benchmarking, strategic planning, and informed decision-making in the field of SEO.

Update and Accessibility: To ensure relevance and timeliness, the dataset will be regularly updated with new information and emerging trends in the SEO world.
ESSD Data Citation Statistics (2014-2023)
scidb.cn
Updated Mar 6, 2025
Cite
Lili ZHANG; Jiayi Hui; Ruilin LIU; Yike HU (2025). ESSD Data Citation Statistics (2014-2023) [Dataset]. http://doi.org/10.57760/sciencedb.12736
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Unique identifier
https://doi.org/10.57760/sciencedb.12736
Dataset updated
Mar 6, 2025
Dataset provided by
Science Data Bank
Authors
Lili ZHANG; Jiayi Hui; Ruilin LIU; Yike HU
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
This dataset provides the supporting materials for a paper entitled "Characterizing Data Reusability through A Data Citation Framework- A Case Study on Earth System Science Data". The dataset is mainly used to support the different analyses within this manuscript, particularly on citation demographics, citation intensity, citation aging, and citation networks.
Consumer price inflation consumption segment indices and price quotes
ons.gov.uk
cy.ons.gov.uk
csv
Updated Jun 18, 2025
Cite
Office for National Statistics (2025). Consumer price inflation consumption segment indices and price quotes [Dataset]. https://www.ons.gov.uk/economy/inflationandpriceindices/datasets/consumerpriceindicescpiandretailpricesindexrpiitemindicesandpricequotes
Explore at:
Available download formats: csv
Dataset updated
Jun 18, 2025
Dataset provided by
Office for National Statistics (http://www.ons.gov.uk/)
License
Open Government Licence 3.0 (http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/)
License information was derived automatically
Description
Price quote data (for locally collected data only) and consumption segment indices that underpin consumer price inflation statistics, giving users access to the detailed data that are used in the construction of the UK’s inflation figures. The data are being made available for research purposes only and are not an accredited official statistic. From October 2024, private school fees and part-time education classes have been included in the consumption segment indices file. For more information on the introduction of consumption segments, please see the Consumer Prices Indices Technical Manual, 2019. Note that this dataset was previously called the consumer price inflation item indices and price quotes dataset.
Louisville Metro KY - Uniform Citation Data 2020
catalog.data.gov
data.lojic.org
+2 more
Updated Apr 13, 2023
Cite
Louisville/Jefferson County Information Consortium (2023). Louisville Metro KY - Uniform Citation Data 2020 [Dataset]. https://catalog.data.gov/dataset/louisville-metro-ky-uniform-citation-data-2020
Dataset updated
Apr 13, 2023
Dataset provided by
Louisville/Jefferson County Information Consortium
Area covered
Louisville, Kentucky
Description
A list of all uniform citations from the Louisville Metro Police Department. The CSV file, which is updated daily and includes case number, date, location, division, beat, offender demographics, statutes and charges, and UCR codes, can be found in this Link.

INCIDENT_NUMBER or CASE_NUMBER links these data sets together:
Crime Data
Uniform Citation Data
Firearm intake
LMPD hate crimes
Assaulted Officers

CITATION_CONTROL_NUMBER links these data sets together:
Uniform Citation Data
LMPD Stops Data

Note: When examining this data, make sure to read the LMPD Crime Data section in our Terms of Use.

AGENCY_DESC - the name of the department that issued the citation
CASE_NUMBER - the number associated with either the incident or used as reference to store the items in our evidence rooms; it can be used to connect the dataset to the following other datasets via INCIDENT_NUMBER:
1. Crime Data
2. Firearms intake
3. LMPD hate crimes
4. Assaulted Officers
NOTE: CASE_NUMBER is not formatted the same as the INCIDENT_NUMBER in the other datasets. For example: in the Uniform Citation Data you have CASE_NUMBER 8018013155 (no dashes) which matches up with INCIDENT_NUMBER 80-18-013155 in the other 4 datasets.
CITATION_YEAR - the year the citation was issued
CITATION_CONTROL_NUMBER - links this dataset to the LMPD Stops Data
CITATION_TYPE_DESC - the type of citation issued (citations include: general citations, summons, warrants, arrests, and juvenile)
CITATION_DATE - the date the citation was issued
CITATION_LOCATION - the location the citation was issued
DIVISION - the LMPD division in which the citation was issued
BEAT - the LMPD beat in which the citation was issued
PERSONS_SEX - the gender of the person who received the citation
PERSONS_RACE - the race of the person who received the citation (W-White, B-Black, H-Hispanic, A-Asian/Pacific Islander, I-American Indian, U-Undeclared, IB-Indian/India/Burmese, M-Middle Eastern Descent, AN-Alaskan Native)
PERSONS_ETHNICITY - the ethnicity of the person who received the citation (N-Not Hispanic, H-Hispanic, U-Undeclared)
PERSONS_AGE - the age of the person who received the citation
PERSONS_HOME_CITY - the city in which the person who received the citation lives
PERSONS_HOME_STATE - the state in which the person who received the citation lives
PERSONS_HOME_ZIP - the zip code in which the person who received the citation lives
VIOLATION_CODE - multiple alpha/numeric code assigned by the Kentucky State Police to link to a Kentucky Revised Statute. For a full list of codes visit: https://kentuckystatepolice.org/crime-traffic-data/
ASCF_CODE - the code that follows the guidelines of the American Security Council Foundation. For more details visit https://www.ascfusa.org/
STATUTE - multiple alpha/numeric code representing a Kentucky Revised Statute. For a full list of Kentucky Revised Statute information visit: https://apps.legislature.ky.gov/law/statutes/
CHARGE_DESC - the description of the type of charge for the citation
UCR_CODE - the code that follows the guidelines of the Uniform Crime Report. For more details visit https://ucr.fbi.gov/
UCR_DESC - the description of the UCR_CODE. For more details visit https://ucr.fbi.gov/
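The CASE_NUMBER note above implies a small reformatting step before joining against the other datasets. A minimal sketch follows, assuming every CASE_NUMBER is a 10-digit string that splits 2-2-6 like the single example given (8018013155 to 80-18-013155). That pattern is inferred from the one example only, and the function name `case_to_incident_number` is hypothetical.

```python
def case_to_incident_number(case_number: str) -> str:
    """Insert dashes so a CASE_NUMBER matches the INCIDENT_NUMBER layout.

    Assumption: 10 digits, grouped 2-2-6, as in the description's example.
    """
    digits = case_number.strip()
    if len(digits) != 10 or not digits.isdigit():
        raise ValueError(f"unexpected CASE_NUMBER format: {case_number!r}")
    return f"{digits[:2]}-{digits[2:4]}-{digits[4:]}"


print(case_to_incident_number("8018013155"))  # 80-18-013155
```

Rejecting anything that is not exactly ten digits keeps malformed values from silently producing wrong join keys; real data may contain other layouts that need their own handling.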
Louisville Metro KY - Uniform Citation Data 2022
catalog.data.gov
data.lojic.org
+5 more
Updated Apr 13, 2023
Cite
Louisville/Jefferson County Information Consortium (2023). Louisville Metro KY - Uniform Citation Data 2022 [Dataset]. https://catalog.data.gov/dataset/louisville-metro-ky-uniform-citation-data-2022
Dataset updated
Apr 13, 2023
Dataset provided by
Louisville/Jefferson County Information Consortium
Area covered
Louisville, Kentucky
Description
Note: Due to a system migration, this data will cease to update on March 14th, 2023. The current projection is to restart the updates within 30 days of the system migration, on or around April 13th, 2023.

A list of all uniform citations from the Louisville Metro Police Department. The CSV file, which is updated daily and includes case number, date, location, division, beat, offender demographics, statutes and charges, and UCR codes, can be found in this Link.

INCIDENT_NUMBER or CASE_NUMBER links these data sets together:
Crime Data
Uniform Citation Data
Firearm intake
LMPD hate crimes
Assaulted Officers

CITATION_CONTROL_NUMBER links these data sets together:
Uniform Citation Data
LMPD Stops Data

Note: When examining this data, make sure to read the LMPD Crime Data section in our Terms of Use.

AGENCY_DESC - the name of the department that issued the citation
CASE_NUMBER - the number associated with either the incident or used as reference to store the items in our evidence rooms; it can be used to connect the dataset to the following other datasets via INCIDENT_NUMBER:
1. Crime Data
2. Firearms intake
3. LMPD hate crimes
4. Assaulted Officers
NOTE: CASE_NUMBER is not formatted the same as the INCIDENT_NUMBER in the other datasets. For example: in the Uniform Citation Data you have CASE_NUMBER 8018013155 (no dashes) which matches up with INCIDENT_NUMBER 80-18-013155 in the other 4 datasets.
CITATION_YEAR - the year the citation was issued
CITATION_CONTROL_NUMBER - links this dataset to the LMPD Stops Data
CITATION_TYPE_DESC - the type of citation issued (citations include: general citations, summons, warrants, arrests, and juvenile)
CITATION_DATE - the date the citation was issued
CITATION_LOCATION - the location the citation was issued
DIVISION - the LMPD division in which the citation was issued
BEAT - the LMPD beat in which the citation was issued
PERSONS_SEX - the gender of the person who received the citation
PERSONS_RACE - the race of the person who received the citation (W-White, B-Black, H-Hispanic, A-Asian/Pacific Islander, I-American Indian, U-Undeclared, IB-Indian/India/Burmese, M-Middle Eastern Descent, AN-Alaskan Native)
PERSONS_ETHNICITY - the ethnicity of the person who received the citation (N-Not Hispanic, H-Hispanic, U-Undeclared)
PERSONS_AGE - the age of the person who received the citation
PERSONS_HOME_CITY - the city in which the person who received the citation lives
PERSONS_HOME_STATE - the state in which the person who received the citation lives
PERSONS_HOME_ZIP - the zip code in which the person who received the citation lives
VIOLATION_CODE - multiple alpha/numeric code assigned by the Kentucky State Police to link to a Kentucky Revised Statute. For a full list of codes visit: https://kentuckystatepolice.org/crime-traffic-data/
ASCF_CODE - the code that follows the guidelines of the American Security Council Foundation. For more details visit https://www.ascfusa.org/
STATUTE - multiple alpha/numeric code representing a Kentucky Revised Statute. For a full list of Kentucky Revised Statute information visit: https://apps.legislature.ky.gov/law/statutes/
CHARGE_DESC - the description of the type of charge for the citation
UCR_CODE - the code that follows the guidelines of the Uniform Crime Report. For more details visit https://ucr.fbi.gov/
UCR_DESC - the description of the UCR_CODE. For more details visit https://ucr.fbi.gov/
Data Citation Corpus Data File
redivis.com
zenodo.org
Updated Apr 29, 2025
Cite
(2025). Data Citation Corpus Data File [Dataset]. https://redivis.com/workflows/hx1e-a6w8vmwsx
Dataset updated
Apr 29, 2025
Description
Data file for the first release of the Data Citation Corpus, produced by DataCite and Make Data Count as part of an ongoing grant project funded by the Wellcome Trust.
Using Open Citation Databases for Snowballing in Software Engineering...
zenodo.org
data.niaid.nih.gov
bin, csv, pdf, zip
Updated Jul 12, 2024
Cite
Leif Bonorden (2024). Using Open Citation Databases for Snowballing in Software Engineering Research [Dataset]. http://doi.org/10.5281/zenodo.7938497
Explore at:
Available download formats: csv, bin, zip, pdf
Unique identifier
https://doi.org/10.5281/zenodo.7938497
Dataset updated
Jul 12, 2024
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Leif Bonorden
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
Dataset for our study on the coverage of software engineering articles in open citation databases:

a list of the 23 sampled venues with their respective CORE ranks and publishers,

01-venues.csv,

a list of the 204 sampled articles with their respective number of references/citations per citation database,

02-articles.csv (articles with publication information),

03-references-absolute.csv (number of references in published PDF & absolute numbers for reference coverage in databases),

04-references-relative.csv (relative numbers for reference coverage in databases),

05-citations-absolute.csv (absolute numbers for citation coverage in databases),

06-citations relative.csv (relative numbers for citation coverage in databases),

a list of the 8 articles analyzed in more detail with complete references data from the citation databases,

07-selected-articles.csv (articles with publication information),

08A–08H (comparison of references found in databases for each article),

and additional statistical measures and plots

09-Statistics.{pdf,xlsx} (statistical measures – i.e., minimum, maximum, median, average, variance – for the whole dataset and for subsets by publisher, CORE rank, or year of publication),

10-Figures.zip (figures for references as shown in the study and additional figures for citations – each in EPS and PNG format).
Data Citation Corpus
redivis.com
Updated May 13, 2025
Cite
Redivis Demo Organization (2025). Data Citation Corpus [Dataset]. https://redivis.com/datasets/am5t-e9jvcn6s5
Dataset updated
May 13, 2025
Dataset provided by
Redivis Inc.
Authors
Redivis Demo Organization
Time period covered
Jan 1, 1839 - Oct 1, 2115
Description
The table Data Citation Corpus is part of the dataset Data Citation Corpus Data File, available at https://redivis.com/datasets/am5t-e9jvcn6s5. It contains 5256114 rows across 14 variables.
A dataset from a survey investigating disciplinary differences in data...
zenodo.org
data.niaid.nih.gov
bin, csv, pdf, txt
Updated Jul 12, 2024
Cite
Anton Boudreau Ninkov; Chantal Ripp; Kathleen Gregory; Isabella Peters; Stefanie Haustein (2024). A dataset from a survey investigating disciplinary differences in data citation [Dataset]. http://doi.org/10.5281/zenodo.7853477
Explore at:
Available download formats: txt, pdf, bin, csv
Unique identifier
https://doi.org/10.5281/zenodo.7853477
Dataset updated
Jul 12, 2024
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Anton Boudreau Ninkov; Chantal Ripp; Kathleen Gregory; Isabella Peters; Stefanie Haustein
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
GENERAL INFORMATION

Title of Dataset: A dataset from a survey investigating disciplinary differences in data citation

Date of data collection: January to March 2022

Collection instrument: SurveyMonkey

Funding: Alfred P. Sloan Foundation

SHARING/ACCESS INFORMATION

Licenses/restrictions placed on the data: These data are available under a CC BY 4.0 license

Links to publications that cite or use the data:

Gregory, K., Ninkov, A., Ripp, C., Peters, I., & Haustein, S. (2022). Surveying practices of data citation and reuse across disciplines. Proceedings of the 26th International Conference on Science and Technology Indicators. International Conference on Science and Technology Indicators, Granada, Spain. https://doi.org/10.5281/ZENODO.6951437

Gregory, K., Ninkov, A., Ripp, C., Roblin, E., Peters, I., & Haustein, S. (2023). Tracing data: A survey investigating disciplinary differences in data citation. Zenodo. https://doi.org/10.5281/zenodo.7555266

DATA & FILE OVERVIEW

File List

Filename: MDCDatacitationReuse2021Codebookv2.pdf
Codebook

Filename: MDCDataCitationReuse2021surveydatav2.csv
Dataset format in csv

Filename: MDCDataCitationReuse2021surveydatav2.sav
Dataset format in SPSS

Filename: MDCDataCitationReuseSurvey2021QNR.pdf
Questionnaire

Additional related data collected that was not included in the current data package: Open ended questions asked to respondents

METHODOLOGICAL INFORMATION

Description of methods used for collection/generation of data:

The development of the questionnaire (Gregory et al., 2022) was centered around the creation of two main branches of questions for the primary groups of interest in our study: researchers that reuse data (33 questions in total) and researchers that do not reuse data (16 questions in total). The population of interest for this survey consists of researchers from all disciplines and countries, sampled from the corresponding authors of papers indexed in the Web of Science (WoS) between 2016 and 2020.

The survey received 3,632 responses, 2,509 of which were completed, representing a completion rate of 68.6%. Incomplete responses were excluded from the dataset. The final total contains 2,492 complete responses and an uncorrected response rate of 1.57%. Controlling for invalid emails, bounced emails and opt-outs (n=5,201) produced a response rate of 1.62%, similar to surveys using comparable recruitment methods (Gregory et al., 2020).

Methods for processing the data:

Results were downloaded from SurveyMonkey in CSV format and were prepared for analysis using Excel and SPSS by recoding ordinal and multiple choice questions and by removing missing values.

Instrument- or software-specific information needed to interpret the data:

The dataset is provided in SPSS format, which requires IBM SPSS Statistics. The dataset is also available in a coded format in CSV. The Codebook is required to interpret the values.

DATA-SPECIFIC INFORMATION FOR: MDCDataCitationReuse2021surveydata

Number of variables: 95

Number of cases/rows: 2,492

Missing data codes: 999 Not asked

Refer to MDCDatacitationReuse2021Codebook.pdf for detailed variable information.
Data from: A stochastic generative model for citation networks among...
data.niaid.nih.gov
search.dataone.org
+2 more
zip
Updated Jun 5, 2022
Cite
Yuichiro Yasui (2022). A stochastic generative model for citation networks among academic papers [Dataset]. http://doi.org/10.5061/dryad.z8w9ghxfh
Explore at:
Available download formats: zip
Unique identifier
https://doi.org/10.5061/dryad.z8w9ghxfh
Dataset updated
Jun 5, 2022
Dataset provided by
The Graduate University for Advanced Studies, SOKENDAI
Authors
Yuichiro Yasui
License
https://spdx.org/licenses/CC0-1.0.html
Description
We propose a stochastic generative model to represent a directed graph constructed by citations among academic papers, where nodes and directed edges represent papers with discrete publication time and citations respectively. Like existing models, the proposed model assumes that a citation between two papers occurs with a probability based on the type of the citing paper, the importance of the cited paper, and the difference between their publication times. We consider the out-degree of a citing paper as its type, because, for example, a survey paper cites many papers. We approximate the importance of a cited paper by its in-degree. In our model, we adopt three functions: a logistic function for illustrating the numbers of papers published in discrete time, an inverse Gaussian probability distribution function to express the aging effect based on the difference between publication times, and an exponential distribution (or a generalized Pareto distribution) for describing the out-degree distribution. We consider that our model is a more reasonable and appropriate stochastic model than other existing models and can perform complete simulations without using original data. In this paper, we first use the Web of Science database and see the features used in our model. By using the proposed model, we can generate simulated graphs and demonstrate that they are similar to the original data concerning the in- and out-degree distributions, and node triangle participation. In addition, we analyze two other citation networks derived from physics papers in the arXiv database and verify the effectiveness of the model.
Methods: We focus on a subset of the Web of Science (WoS), WoS-Stat, which is a citation network that comprises the citations between papers published in journals whose subject is associated with “Statistics and Probability.” We construct a citation network utilizing a paper identifier (ID), publication year, and reference list (list of paper IDs) for 36 years, from 1981 to 2016. WoS-Stat consists of 179,483 papers and 1,106,622 citations.
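To make the aging-effect component mentioned in the description more concrete, here is a minimal sketch of the standard inverse Gaussian density evaluated at a few publication-time differences. The parameter names `mu` (mean) and `lam` (shape) and the values used are placeholders chosen for illustration; the description does not state the fitted parameters or how the authors combine this term with the others, so this should not be read as their implementation.

```python
import math


def inverse_gaussian_pdf(x, mu, lam):
    """Standard inverse Gaussian density with mean mu and shape lam (x > 0)."""
    if x <= 0:
        return 0.0
    return math.sqrt(lam / (2 * math.pi * x**3)) * math.exp(
        -lam * (x - mu) ** 2 / (2 * mu**2 * x)
    )


# Placeholder parameters, not values from the study: relative aging weight
# for publication-time differences of 1 to 10 years.
weights = [inverse_gaussian_pdf(age, mu=5.0, lam=10.0) for age in range(1, 11)]
for age, weight in enumerate(weights, start=1):
    print(age, round(weight, 4))
```

The density rises from zero, peaks, and then decays with a long right tail, which is the general shape one would expect for a citation aging curve.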
Data behind State of Open Data 2024 Special Report: Bridging policy and...
figshare.com
xlsx
Updated Nov 27, 2024
Cite
Mark Hahnel (2024). Data behind State of Open Data 2024 Special Report: Bridging policy and practice in data sharing - Country, Funder and Affiliation Datasets [Dataset]. http://doi.org/10.6084/m9.figshare.27900828.v1
Explore at:
Available download formats: xlsx
Unique identifier
https://doi.org/10.6084/m9.figshare.27900828.v1
Dataset updated
Nov 27, 2024
Dataset provided by
figshare
Figshare (http://figshare.com/)
Authors
Mark Hahnel
License
CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
License information was derived automatically
Description
This dataset contains 3 datasets behind graphs generated in the "State of Open Data 2024 Special Report: Bridging policy and practice in data sharing". The datasets include counts and percentages for papers that link to datasets, filtered by Country, Funder and Affiliation. The datasets were generated by combining the DataCite Data Citation Corpus (https://corpus.datacite.org/dashboard) with Dimensions (https://www.dimensions.ai/) in Google BigQuery.
Statistical information of data set D2.
plos.figshare.com
figshare.com
xls
Updated Jun 1, 2023
Shahzad Nazir; Muhammad Asif; Shahbaz Ahmad; Faisal Bukhari; Muhammad Tanvir Afzal; Hanan Aljuaid (2023). Statistical information of data set D2. [Dataset]. http://doi.org/10.1371/journal.pone.0228885.t003
Explore at:
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pone.0228885.t003
Dataset updated
Jun 1, 2023
Dataset provided by
PLOS ONE
Authors
Shahzad Nazir; Muhammad Asif; Shahbaz Ahmad; Faisal Bukhari; Muhammad Tanvir Afzal; Hanan Aljuaid
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
Statistical information of data set D2.
Infra-Annual Labor Statistics: Employment Total: From 15 to 64 Years for...
fred.stlouisfed.org
json
Updated Jun 16, 2025
(2025). Infra-Annual Labor Statistics: Employment Total: From 15 to 64 Years for United States [Dataset]. https://fred.stlouisfed.org/series/LFEM64TTUSQ647S
Explore at:
Available download formats: json
Dataset updated
Jun 16, 2025
License
https://fred.stlouisfed.org/legal/#copyright-citation-required
Area covered
United States
Description
Graph and download economic data for Infra-Annual Labor Statistics: Employment Total: From 15 to 64 Years for United States (LFEM64TTUSQ647S) from Q1 1970 to Q1 2025 about 15 to 64 years, employment, and USA.
Louisville Metro KY - Uniform Citation Data 2023
data.louisvilleky.gov
data.lojic.org
+4more
Updated Jan 4, 2023
Louisville/Jefferson County Information Consortium (2023). Louisville Metro KY - Uniform Citation Data 2023 [Dataset]. https://data.louisvilleky.gov/datasets/louisville-metro-ky-uniform-citation-data-2023-1/about
Explore at:
Dataset updated
Jan 4, 2023
Dataset authored and provided by
Louisville/Jefferson County Information Consortium
License
https://louisville-metro-opendata-lojic.hub.arcgis.com/pages/terms-of-use-and-license
Area covered
Louisville, Kentucky
Description
Note: Due to a system migration, this data will cease to update on March 14th, 2023. At this time we are updating this dataset manually once per month as resources allow. For real time crime data please utilize communitycrimemap.com

A list of all uniform citations from the Louisville Metro Police Department. The CSV file is updated daily and includes case number, date, location, division, beat, offender demographics, statutes and charges, and UCR codes; it can be found in this Link.

INCIDENT_NUMBER or CASE_NUMBER links these data sets together:
- Crime Data
- Uniform Citation Data
- Firearm intake
- LMPD hate crimes
- Assaulted Officers

CITATION_CONTROL_NUMBER links these data sets together:
- Uniform Citation Data
- LMPD Stops Data

Note: When examining this data, make sure to read the LMPD Crime Data section in our Terms of Use.

AGENCY_DESC - the name of the department that issued the citation
CASE_NUMBER - the number associated with either the incident or used as reference to store the items in our evidence rooms; it can be used to connect the dataset to the INCIDENT_NUMBER in the following other datasets:
1. Crime Data
2. Firearms intake
3. LMPD hate crimes
4. Assaulted Officers
NOTE: CASE_NUMBER is not formatted the same as the INCIDENT_NUMBER in the other datasets. For example: in the Uniform Citation Data you have CASE_NUMBER 8018013155 (no dashes) which matches up with INCIDENT_NUMBER 80-18-013155 in the other 4 datasets.
CITATION_YEAR - the year the citation was issued
CITATION_CONTROL_NUMBER - links this dataset to the LMPD Stops Data
CITATION_TYPE_DESC - the type of citation issued (citations include: general citations, summons, warrants, arrests, and juvenile)
CITATION_DATE - the date the citation was issued
CITATION_LOCATION - the location the citation was issued
DIVISION - the LMPD division in which the citation was issued
BEAT - the LMPD beat in which the citation was issued
PERSONS_SEX - the gender of the person who received the citation
PERSONS_RACE - the race of the person who received the citation (W-White, B-Black, H-Hispanic, A-Asian/Pacific Islander, I-American Indian, U-Undeclared, IB-Indian/India/Burmese, M-Middle Eastern Descent, AN-Alaskan Native)
PERSONS_ETHNICITY - the ethnicity of the person who received the citation (N-Not Hispanic, H-Hispanic, U-Undeclared)
PERSONS_AGE - the age of the person who received the citation
PERSONS_HOME_CITY - the city in which the person who received the citation lives
PERSONS_HOME_STATE - the state in which the person who received the citation lives
PERSONS_HOME_ZIP - the zip code in which the person who received the citation lives
VIOLATION_CODE - multiple alpha/numeric code assigned by the Kentucky State Police to link to a Kentucky Revised Statute. For a full list of codes visit: https://kentuckystatepolice.org/crime-traffic-data/
ASCF_CODE - the code that follows the guidelines of the American Security Council Foundation. For more details visit https://www.ascfusa.org/
STATUTE - multiple alpha/numeric code representing a Kentucky Revised Statute. For a full list of Kentucky Revised Statute information visit: https://apps.legislature.ky.gov/law/statutes/
CHARGE_DESC - the description of the type of charge for the citation
UCR_CODE - the code that follows the guidelines of the Uniform Crime Report. For more details visit https://ucr.fbi.gov/
UCR_DESC - the description of the UCR_CODE. For more details visit https://ucr.fbi.gov/
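The CASE_NUMBER note in this description gives one worked example (8018013155 corresponds to 80-18-013155). A minimal sketch of that conversion follows. It assumes every CASE_NUMBER is a 10-digit string laid out as 2 + 2 + 6 digits, which is inferred from that single example rather than stated by the data provider, and the function name is hypothetical.

```python
def case_to_incident(case_number: str) -> str:
    """Insert dashes after the 2nd and 4th digits of a 10-digit CASE_NUMBER."""
    digits = case_number.strip()
    # Assumption: the 2-2-6 layout seen in the documented example always holds.
    if len(digits) != 10 or not digits.isdigit():
        raise ValueError(f"expected a 10-digit CASE_NUMBER, got {case_number!r}")
    return f"{digits[:2]}-{digits[2:4]}-{digits[4:]}"

print(case_to_incident("8018013155"))  # 80-18-013155
```

Rejecting anything that is not exactly ten digits keeps malformed values from being silently joined to the wrong incident; whether real records ever deviate from this layout would need checking against the data.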
Nominal Statistical Discrepancy for Great Britain
fred.stlouisfed.org
json
Updated Jun 9, 2025
(2025). Nominal Statistical Discrepancy for Great Britain [Dataset]. https://fred.stlouisfed.org/series/NSDGDPSAXDCGBQ
Explore at:
Available download formats: json
Dataset updated
Jun 9, 2025
License
https://fred.stlouisfed.org/legal/#copyright-citation-required
Area covered
United Kingdom
Description
Graph and download economic data for Nominal Statistical Discrepancy for Great Britain (NSDGDPSAXDCGBQ) from Q1 1995 to Q1 2025 about residual and United Kingdom.
Data from: Second-generation citation context analysis (2010-2019) to...
databank.illinois.edu
Updated Sep 2, 2020
Jodi Schneider; Di Ye; Alison Hill (2020). Second-generation citation context analysis (2010-2019) to retracted paper Matsuyama 2005 [Dataset]. http://doi.org/10.13012/B2IDB-3331845_V2
Explore at:
Unique identifier
https://doi.org/10.13012/B2IDB-3331845_V2
Dataset updated
Sep 2, 2020
Authors
Jodi Schneider; Di Ye; Alison Hill
License
CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
License information was derived automatically
Dataset funded by
Alfred P. Sloan Foundation
Description
Citation context annotation. This dataset is a second version (V2) and part of the supplemental data for Jodi Schneider, Di Ye, Alison Hill, and Ashley Whitehorn. (2020) "Continued post-retraction citation of a fraudulent clinical trial report, eleven years after it was retracted for falsifying data". Scientometrics. In press, DOI: 10.1007/s11192-020-03631-1 Publications were selected by examining all citations to the retracted paper Matsuyama 2005, and selecting the 35 citing papers, published 2010 to 2019, which do not mention the retraction, but which mention the methods or results of the retracted paper (called "specific" in Ye, Di; Hill, Alison; Whitehorn (Fulton), Ashley; Schneider, Jodi (2020): Citation context annotation for new and newly found citations (2006-2019) to retracted paper Matsuyama 2005. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-8150563_V1 ). The annotated citations are second-generation citations to the retracted paper Matsuyama 2005 (RETRACTED: Matsuyama W, Mitsuyama H, Watanabe M, Oonakahara KI, Higashimoto I, Osame M, Arimura K. Effects of omega-3 polyunsaturated fatty acids on inflammatory markers in COPD. Chest. 2005 Dec 1;128(6):3817-27.), retracted in 2008 (Retraction in: Chest (2008) 134:4 (893) https://doi.org/10.1016/S0012-3692(08)60339-6). 
OVERALL DATA for VERSION 2 (V2)

FILES/FILE FORMATS
Same data in two formats:
2010-2019 SG to specific not mentioned FG.csv - Unicode CSV (preservation format only) - same as in V1
2010-2019 SG to specific not mentioned FG.xlsx - Excel workbook (preferred format) - same as in V1
Additional files in V2:
2G-possible-misinformation-analyzed.csv - Unicode CSV (preservation format only)
2G-possible-misinformation-analyzed.xlsx - Excel workbook (preferred format)

ABBREVIATIONS:
2G - Refers to the second-generation of Matsuyama
FG - Refers to the direct citation of Matsuyama (the one the second-generation item cites)

COLUMN HEADER EXPLANATIONS
File name: 2G-possible-misinformation-analyzed. Other column headers in this file have the same meaning as explained in V1. The following are additional header explanations:
Quote Number - The order of the quote (citation context citing the first generation article given in "FG in bibliography") in the second generation article (given in "2G article")
Quote - The text of the quote (citation context citing the first generation article given in "FG in bibliography") in the second generation article (given in "2G article")
Translated Quote - English translation of "Quote", automatic translation from Google Scholar
Seriousness/Risk - Our assessment of the risk of misinformation and its seriousness
2G topic - Our assessment of the topic of the cited article (the second generation article given in "2G article")
2G section - The section of the citing article (the second generation article given in "2G article") in which the cited article (the first generation article given in "FG in bibliography") was found
FG in bib type - The type of article (e.g., review article), referring to the cited article (the first generation article given in "FG in bibliography")
FG in bib topic - Our assessment of the topic of the cited article (the first generation article given in "FG in bibliography")
FG in bib section - The section of the cited article (the first generation article given in "FG in bibliography") in which the Matsuyama retracted paper was cited
Data from: Dataset for 'A Matter of Culture? Conceptualising and...
zenodo.org
csv
Updated Oct 22, 2024
Rhodri Leng; Justyna Bandola-Gill; Katherine Smith; Valerie Pattyn; Niklas Andersen (2024). Dataset for 'A Matter of Culture? Conceptualising and Investigating 'Evidence Cultures' within Research on Evidence-Informed Policymaking' [Dataset]. http://doi.org/10.5281/zenodo.13972074
Explore at:
Available download formats: csv
Unique identifier
https://doi.org/10.5281/zenodo.13972074
Dataset updated
Oct 22, 2024
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Rhodri Leng; Justyna Bandola-Gill; Katherine Smith; Valerie Pattyn; Niklas Andersen
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Time period covered
Oct 22, 2024
Description
Introduction
This document describes the data collection and datasets used in the manuscript "A Matter of Culture? Conceptualising and Investigating ‘Evidence Cultures’ within Research on Evidence-Informed Policymaking" [1].

Data Collection

To construct the citation network analysed in the manuscript, we first designed a series of queries to capture a large sample of literature exploring the relationship between evidence, policy, and culture from various perspectives. Our team of domain experts developed the following queries based on terms common in the literature. These queries search for the terms included in the titles, abstracts, and associated keywords of WoS indexed records (i.e. ‘TS=’). While these are separated below for ease of reading, they were combined into a single query via the OR operator in our search. Our search was conducted on the Web of Science’s (WoS) Core Collection through the University of Edinburgh Library subscription on 29/11/2023, returning a total of 2,089 records.

TS = ((“cultures of evidence” OR “culture of evidence” OR “culture of knowledge” OR “cultures of knowledge” OR “research culture” OR “research cultures” OR “culture of research” OR “cultures of research” OR “epistemic culture” OR “epistemic cultures” OR “epistemic community” OR “epistemic communities” OR “epistemic infrastructure” OR “evaluation culture” OR “evaluation cultures” OR “culture of evaluation” OR “cultures of evaluation” OR “thought style” OR “thought styles” OR “thought collective” OR “thought collectives” OR “knowledge regime” OR “knowledge regimes” OR “knowledge system” OR “knowledge systems” OR “civic epistemology” OR “civic epistemologies”) AND (“policy” OR “policies” OR “policymaking” OR “policy making” OR “policymaker” OR “policymakers” OR “policy maker” OR “policy makers” OR “policy decision” OR “policy decisions” OR “political decision” OR “political decisions” OR “political decision making”))

OR

TS = ((“culture” OR “cultures”) AND ((“evidence-based” OR “evidence-informed” OR “evidence-led” OR “science-based” OR “science-informed” OR “science-led” OR “research-based” OR “research-informed” OR “evidence use” OR “evidence user” OR “evidence utilisation” OR “evidence utilization” OR “research use” OR “researcher user” OR “research utilisation” OR “research utilization” OR “research in” OR “evidence in” OR “science in”) NEAR/1 (“policymaking” OR “policy making” OR “policy maker” OR “policy makers”)))

OR

TS = ((“culture” OR “cultures”) AND (“scientific advice” OR “technical advice” OR “scientific expertise” OR “technical expertise” OR “expert advice”) AND (“policy” OR “policies” OR “policymaking” OR “policy making” OR “policymaker” OR “policymakers” OR “policy maker” OR “policy makers” OR “political decision” OR “political decisions” OR “political decision making”))

OR

TS = ((“culture” OR “cultures”) AND (“post-normal science” OR “trans-science” OR “transdisciplinary” OR “transdisiplinarity” OR “science-policy interface” OR “policy sciences” OR “sociology of knowledge” OR “sociology of science” OR “knowledge transfer” OR “knowledge translation” OR “knowledge broker” OR “implementation science” OR “risk society”) AND (“policymaking” OR “policy making” OR “policymaker” OR “policymakers” OR “policy maker” OR “policy makers”))

Citation Network Construction

All bibliographic metadata on these 2,089 records were downloaded in five batches in plain text and then merged in R. We then parsed these data into network readable files. All unique reference strings are given unique node IDs. A node-attribute-list (‘CE_Node’) links identifying information of each document with its node ID, including authors, title, year of publication, journal WoS ID, and WoS citations. An edge-list (‘CE_Edge’) records all citations from these documents to their bibliographies – with edges going from a citing document to the cited – using the relevant node IDs. These data were then cleaned by (a) matching DOIs for reference strings that differ but point to the same paper, and (b) manual merging of obvious duplicates caused by referencing errors.

Our initial dataset consisted of 2,089 retrieved documents and 123,772 unretrieved cited documents (i.e. documents that were cited within the publications we retrieved but which were not one of these 2,089 documents). These documents were connected by 157,229 citation links, but ~87% of the documents in the network were cited just once. To focus on relevant literature, we filtered the network to include only documents with at least three citation or reference links. We further refined the dataset by focusing on the main connected component, resulting in 6,650 nodes and 29,198 edges. It is this dataset that we publish here, and it is this network that underpins Figure 1, Table 1, and the qualitative examination of documents (see manuscript for further details).
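The two filtering steps described above (a minimum number of links per document, then restriction to the main connected component) can be sketched roughly as follows. This is not the authors' R code: the function name is hypothetical, and it assumes that "links" means incoming plus outgoing edges counted in a single pass, and that the main component is the largest weakly connected one.

```python
from collections import Counter, defaultdict, deque

def filter_and_main_component(edges, min_links=3):
    """Keep nodes with >= min_links incident edges, then return the largest
    weakly connected component of what remains, with its edges."""
    degree = Counter()
    for src, dst in edges:
        degree[src] += 1
        degree[dst] += 1
    kept = {node for node, d in degree.items() if d >= min_links}
    kept_edges = [(s, t) for s, t in edges if s in kept and t in kept]

    # Undirected adjacency: direction is ignored for weak connectivity.
    neighbours = defaultdict(set)
    for s, t in kept_edges:
        neighbours[s].add(t)
        neighbours[t].add(s)

    best, seen = set(), set()
    for start in kept:
        if start in seen:
            continue
        component, queue = {start}, deque([start])
        while queue:  # breadth-first search from this start node
            node = queue.popleft()
            for nxt in neighbours[node]:
                if nxt not in component:
                    component.add(nxt)
                    queue.append(nxt)
        seen |= component
        if len(component) > len(best):
            best = component

    main_edges = [(s, t) for s, t in kept_edges if s in best and t in best]
    return best, main_edges
```

Called on a list of (citing, cited) ID pairs, it returns the surviving node set and edge list. A single-pass degree filter is only one reading of the description; an iterative filter that re-checks degrees after each removal would give a smaller network.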

Our final network dataset contains 1,819 of the documents in our original query (~87% of the original retrieved records), and 4,831 documents not retrieved via our Web of Science search but cited by at least three of the retrieved documents. We then clustered this network by modularity maximization via the Leiden algorithm [2], detecting 14 clusters with Q=0.59. Citations to documents within the same cluster constitute ~77% of all citations in the network.

Citation Network Dataset Description

We include two network datasets: (i) ‘CE_Node.csv’, which contains 1,819 retrieved documents and 4,831 unretrieved referenced documents, making a total of 6,650 documents (nodes); and (ii) ‘CE_Edge.csv’, which records citations (edges) between the documents (nodes), including a total of 29,198 citation links. These files can be used to construct a network with many different tools, but we have formatted them for use in Gephi 0.10 [3].

‘CE_Node.csv’ is a comma-separated values file that contains two types of nodes:

i. Retrieved documents – these are documents captured by our query. These include full bibliographic metadata and reference lists.

ii. Non-retrieved documents – these are documents referenced by our retrieved documents but were not retrieved via our query. These only have data contained within their reference string (i.e. first author, journal or book title, year of publication, and possibly DOI).

The columns in the .csv refer to:

- Id, the node ID

- Label, the reference string of the document

- DOI, the DOI for the document, if available

- WOS_ID, WoS accession number

- Authors, named authors

- Title, title of document

- Document_type, variable indicating whether a document is an article, review, etc.

- Journal_book_title, journal of publication or title of book

- Publication year, year of publication.

- WOS_times_cited, total Core Collection citations as of 29/11/2023

- Indegree, number of within network citations to a given document

- Cluster, provides the cluster membership number as discussed in the manuscript (Figure 1)

‘CE_Edge.csv’ is a comma-separated values file that contains edges (citation links) between nodes (documents) (n=29,198). The columns refer to:

- Source, node ID of the citing document

- Target, node ID of the cited document
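Given the Source and Target columns listed above, the within-network in-degree of each document can be recomputed from the edge list alone. The sketch below uses a small in-memory stand-in for ‘CE_Edge.csv’; the toy rows and the function name are invented for illustration, and only the two column names come from the description.

```python
import csv
import io
from collections import Counter

def indegree_from_edge_csv(handle):
    """Count how often each node ID appears in the Target column."""
    reader = csv.DictReader(handle)
    return Counter(row["Target"] for row in reader)

# Toy stand-in for 'CE_Edge.csv'; real node IDs will differ.
toy = io.StringIO("Source,Target\n1,2\n3,2\n1,3\n")
counts = indegree_from_edge_csv(toy)
print(counts["2"], counts["3"], counts["1"])  # 2 1 0
```

For the published file, the same function could be passed a handle from `open("CE_Edge.csv", newline="", encoding="utf-8")`; the encoding is an assumption. The resulting counts should correspond to the Indegree column of ‘CE_Node.csv’ if that column was computed the same way.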

Cluster Analysis

We qualitatively analyse a set of publications from seven of the largest clusters in our manuscript.


Heather A. Piwowar; Todd J. Vision (2013). Data reuse and the open data citation advantage [Dataset]. http://doi.org/10.5061/dryad.781pv

Data from: Data reuse and the open data citation advantage

Explore at:

Available download formats: zip

Unique identifier

https://doi.org/10.5061/dryad.781pv

Dataset updated

Oct 1, 2013

Dataset provided by

National Evolutionary Synthesis Center

Authors

Heather A. Piwowar; Todd J. Vision

License

https://spdx.org/licenses/CC0-1.0.html

Description

Background: Attribution to the original contributor upon reuse of published data is important both as a reward for data creators and to document the provenance of research findings. Previous studies have found that papers with publicly available datasets receive a higher number of citations than similar studies without available data. However, few previous analyses have had the statistical power to control for the many variables known to predict citation rate, which has led to uncertain estimates of the "citation benefit". Furthermore, little is known about patterns in data reuse over time and across datasets. Method and Results: Here, we look at citation rates while controlling for many known citation predictors, and investigate the variability of data reuse. In a multivariate regression on 10,555 studies that created gene expression microarray data, we found that studies that made data available in a public repository received 9% (95% confidence interval: 5% to 13%) more citations than similar studies for which the data was not made available. Date of publication, journal impact factor, open access status, number of authors, first and last author publication history, corresponding author country, institution citation history, and study topic were included as covariates. The citation benefit varied with date of dataset deposition: a citation benefit was most clear for papers published in 2004 and 2005, at about 30%. Authors published most papers using their own datasets within two years of their first publication on the dataset, whereas data reuse papers published by third-party investigators continued to accumulate for at least six years. To study patterns of data reuse directly, we compiled 9,724 instances of third party data reuse via mention of GEO or ArrayExpress accession numbers in the full text of papers. 
The level of third-party data use was high: for 100 datasets deposited in year 0, we estimated that 40 papers in PubMed reused a dataset by year 2, 100 by year 4, and more than 150 data reuse papers had been published by year 5. Data reuse was distributed across a broad base of datasets: a very conservative estimate found that 20% of the datasets deposited between 2003 and 2007 had been reused at least once by third parties. Conclusion: After accounting for other factors affecting citation rate, we find a robust citation benefit from open data, although a smaller one than previously reported. We conclude there is a direct effect of third-party data reuse that persists for years beyond the time when researchers have published most of the papers reusing their own data. Other factors that may also contribute to the citation benefit are considered. We further conclude that, at least for gene expression microarray data, a substantial fraction of archived datasets are reused, and that the intensity of dataset reuse has been steadily increasing since 2003.


Data from: Data reuse and the open data citation advantage

OpenCitations Index N-Triples dataset of all the citation data

Data from: National citation patterns of NEJM, The Lancet, JAMA and The BMJ...

The major statistical data of natural referencing

ESSD Data Citation Statistics (2014-2023)

Consumer price inflation consumption segment indices and price quotes

Louisville Metro KY - Uniform Citation Data 2020

Louisville Metro KY - Uniform Citation Data 2022

Data Citation Corpus Data File

Using Open Citation Databases for Snowballing in Software Engineering...

Data Citation Corpus

A dataset from a survey investigating disciplinary differences in data...

Data from: A stochastic generative model for citation networks among...

Data behind State of Open Data 2024 Special Report: Bridging policy and...

Statistical information of data set D2.

Infra-Annual Labor Statistics: Employment Total: From 15 to 64 Years for...

Louisville Metro KY - Uniform Citation Data 2023

Nominal Statistical Discrepancy for Great Britain

Data from: Second-generation citation context analysis (2010-2019) to...

Data from: Dataset for 'A Matter of Culture? Conceptualising and...

Data from: Data reuse and the open data citation advantage