Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Supplementary material to an analysis of data citation practices based on the Data Citation Index (DCI) from Thomson Reuters. This database, launched in 2012, aims to link data sets and data studies with citations received from the rest of Thomson Reuters' citation indexes. Funding bodies and research organizations increasingly demand that researchers make their scientific data available in a reusable and reproducible manner, aiming to maximize the return on funding while providing transparency in the scientific process. The DCI harvests citations to research data from papers indexed in the Web of Knowledge. It relies on the information provided by the data repository, as data citation practices are inconsistent or nonexistent in many cases. The findings of this study show that data citation practices are far from common in most research fields. Some differences have been observed in the way researchers cite data: while in Science and Engineering & Technology data sets were the most cited, in the Social Sciences and Arts & Humanities data studies play a greater role. 88.1% of the records have received no citations, but some repositories show very low uncitedness rates. While data citation practices are rare in most fields, they have expanded in disciplines such as crystallography and genomics. We conclude by emphasizing the role the DCI may play in encouraging consistent and standardized citation of research data, which would allow its use in following the research process developed by researchers, from data collection to publication.
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
The file contains the number of datasets published by researchers affiliated with Most Wiedzy and indexed in the Data Citation Index by Web of Science. The search was performed using the name of the institution in the 'address' field or 'group author' field. Data retrieved and published during the 5th Open Science Conference (1-3.12.2021).
Attribution-NonCommercial 3.0 (CC BY-NC 3.0): https://creativecommons.org/licenses/by-nc/3.0/
License information was derived automatically
Citation metrics are widely used and misused. We have created a publicly available database of top-cited scientists that provides standardized information on citations, h-index, co-authorship-adjusted hm-index, citations to papers in different authorship positions, and a composite indicator (c-score). Separate data are shown for career-long impact and for single-recent-year impact. Metrics are given with and without self-citations, along with the ratio of citations to citing papers; data on retracted papers (based on the Retraction Watch database) and on citations to/from retracted papers have also been added. Scientists are classified into 22 scientific fields and 174 sub-fields according to the standard Science-Metrix classification. Field- and subfield-specific percentiles are also provided for all scientists with at least 5 papers. Career-long data are updated to end-of-2024, and single-recent-year data pertain to citations received during calendar year 2024. The selection is based on the top 100,000 scientists by c-score (with and without self-citations) or a percentile rank of 2% or above in the sub-field. This version (7) is based on the August 1, 2025 snapshot from Scopus, updated to the end of citation year 2024. This work uses Scopus data. Calculations were performed using all Scopus author profiles as of August 1, 2025. If an author is not on the list, it is simply because the composite indicator value was not high enough to appear on the list. It does not mean that the author does not do good work. PLEASE ALSO NOTE THAT THE DATABASE HAS BEEN PUBLISHED IN AN ARCHIVAL FORM AND WILL NOT BE CHANGED. The published version reflects Scopus author profiles at the time of calculation. We thus advise authors to ensure that their Scopus profiles are accurate. REQUESTS FOR CORRECTIONS OF THE SCOPUS DATA (INCLUDING CORRECTIONS IN AFFILIATIONS) SHOULD NOT BE SENT TO US.
They should be sent directly to Scopus, preferably via the Scopus to ORCID feedback wizard (https://orcid.scopusfeedback.com/), so that the correct data can be used in any future annual updates of the citation indicator databases. The c-score focuses on impact (citations) rather than productivity (number of publications), and it also incorporates information on co-authorship and author positions (single, first, last author). If you have additional questions, see the attached FREQUENTLY ASKED QUESTIONS file. Finally, we alert users that all citation metrics have limitations and their use should be tempered and judicious. For more reading, we refer to the Leiden Manifesto: https://www.nature.com/articles/520429a
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The dataset comprises a single list of datasets exported from the Data Citation Index (Web of Science, Clarivate Analytics) in the Astronomy and Astrophysics category for the period 2010-2019, allowing identification of annual evolution, the countries and institutions with the highest productivity, the main repositories and hosting platforms, and use in publications indexed in Web of Science.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Research output of countries and Spanish universities for the 2010-2014 period according to Thomson Reuters' citation indexes: SCI, SSCI, A&HCI, CPCI, BKCI, DCI
This dataset provides information on the H-index and citations of computer science researchers. The H-index is a measure of a researcher's productivity and impact. The higher the H-index, the more productive and influential the researcher is. Citations are another way of measuring a researcher's impact. The more citations a researcher has, the more other researchers have cited their work. This dataset can be used to compare the productivity and impact of computer science researchers.
To use this dataset, simply download it and import it into your favorite statistical software. You can then analyze the data to answer any questions you may have about computer science researchers and their impact.
File: data.csv

| Column name         | Description                                                      |
|:--------------------|:-----------------------------------------------------------------|
| Name                | The name of the researcher. (String)                             |
| Citations 2020      | The number of citations the researcher has in 2020. (Integer)    |
| Total_citation      | The total number of citations the researcher has. (Integer)      |
| Citation_since_2016 | The number of citations the researcher has since 2016. (Integer) |
| HomePage            | The researcher's home page. (String)                             |
| Area of Research    | The researcher's area of research. (String)                      |
| Google_Scholar      | The researcher's Google Scholar page. (String)                   |
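Given the columns above, a minimal sketch of loading data.csv with pandas and ranking researchers by total citations (an illustrative helper, not part of the dataset itself):

```python
import pandas as pd

def top_cited(path, n=10):
    """Return the n researchers with the highest total citation count."""
    df = pd.read_csv(path)
    # 'Total_citation' and 'Name' are the column names from the data description
    return df.sort_values('Total_citation', ascending=False).head(n)[['Name', 'Total_citation']]
```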
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This dataset contains all the citation data (in N-Triples format) included in the OpenCitations Index, released on July 10, 2025. In particular, each citation in the dataset, defined as an individual of the class cito:Citation, includes the following information:

[citation IRI] the Open Citation Identifier (OCI) for the citation, defined in the final part of the URL identifying the citation (https://w3id.org/oc/index/ci/[OCI]);
[property "cito:hasCitingEntity"] the citing entity, identified by its OMID URL (https://opencitations.net/meta/[OMID]);
[property "cito:hasCitedEntity"] the cited entity, identified by its OMID URL (https://opencitations.net/meta/[OMID]);
[property "cito:hasCitationCreationDate"] the creation date of the citation (i.e. the publication date of the citing entity);
[property "cito:hasCitationTimeSpan"] the time span of the citation (i.e. the interval between the publication date of the cited entity and the publication date of the citing entity);
[type "cito:JournalSelfCitation"] records whether the citation is a journal self-citation (i.e. the citing and cited entities are published in the same journal);
[type "cito:AuthorSelfCitation"] records whether the citation is an author self-citation (i.e. the citing and cited entities have at least one author in common).

Note: the information for each citation is sourced from OpenCitations Meta (https://opencitations.net/meta), a database that stores and delivers bibliographic metadata for all bibliographic resources included in the OpenCitations Indexes. The data provided in this dump is therefore based on the state of OpenCitations Meta at the time this collection was generated.

This version of the dataset contains 2,216,426,689 citations. The size of the zipped archive is 87.4 GB, while the size of the unzipped N-Triples files is 2.1 TB.
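The citation time spans are expressed as XSD durations (e.g. "P1Y2M3D" for one year, two months, and three days). A minimal standard-library sketch for unpacking the date components of such a value (an illustrative helper, not OpenCitations code; it handles only the Y/M/D components used for time spans):

```python
import re

def parse_timespan(value):
    """Unpack a date-only XSD duration like 'P1Y2M3D' into (years, months, days).

    A leading '-' (cited entity published after the citing one) negates all parts.
    """
    m = re.fullmatch(r'(-?)P(?:(\d+)Y)?(?:(\d+)M)?(?:(\d+)D)?', value)
    if not m:
        raise ValueError(f"not a date-only XSD duration: {value!r}")
    sign = -1 if m.group(1) else 1
    years, months, days = (int(g) if g else 0 for g in m.groups()[1:])
    return (sign * years, sign * months, sign * days)
```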
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This dataset contains all the citation data (in CSV format) included in POCI, released on 27 December 2022. In particular, each line of the CSV file defines a citation, and includes the following information:

[field "oci"] the Open Citation Identifier (OCI) for the citation;
[field "citing"] the PMID of the citing entity;
[field "cited"] the PMID of the cited entity;
[field "creation"] the creation date of the citation (i.e. the publication date of the citing entity);
[field "timespan"] the time span of the citation (i.e. the interval between the publication date of the cited entity and the publication date of the citing entity);
[field "journal_sc"] records whether the citation is a journal self-citation (i.e. the citing and cited entities are published in the same journal);
[field "author_sc"] records whether the citation is an author self-citation (i.e. the citing and cited entities have at least one author in common).
This version of the dataset contains:
717,654,703 citations; 26,024,862 bibliographic resources.
The size of the zipped archive is 9.6 GB, while the size of the unzipped CSV file is 50 GB. Additional information about POCI is available at its official webpage.
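A CSV file this large is best processed as a stream rather than loaded whole. A minimal sketch that tallies self-citations with Python's standard library (assuming, as in the OpenCitations CSV dumps, that the journal_sc and author_sc fields hold 'yes'/'no' values):

```python
import csv

def tally_self_citations(path):
    """Stream the citations CSV and count journal/author self-citations."""
    counts = {'journal_sc': 0, 'author_sc': 0, 'total': 0}
    with open(path, newline='', encoding='utf-8') as f:
        for row in csv.DictReader(f):
            counts['total'] += 1
            # assumption: self-citation flags are the strings 'yes' / 'no'
            if row.get('journal_sc') == 'yes':
                counts['journal_sc'] += 1
            if row.get('author_sc') == 'yes':
                counts['author_sc'] += 1
    return counts
```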
Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
The file contains the number of datasets published by researchers affiliated with Most Wiedzy and indexed in the Data Citation Index provided by Web of Science. The search was performed using the name of the institution in the 'address' field or 'group author' field. Data retrieved and published during the 5th Open Science Conference (1-3.12.2021).
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Introduction
This note describes the data sets used for all analyses contained in the manuscript 'Oxytocin - a social peptide?' [1], which is currently under review.
Data Collection
The data sets described here were originally retrieved from Web of Science (WoS) Core Collection via the University of Edinburgh’s library subscription [2]. The aim of the original study for which these data were gathered was to survey peer-reviewed primary studies on oxytocin and social behaviour. To capture relevant papers, we used the following query:
TI = (“oxytocin” OR “pitocin” OR “syntocinon”) AND TS = (“social*” OR “pro$social” OR “anti$social”)
The final search was performed on 13 September 2021. This returned a total of 2,747 records, of which 2,049 were classified by WoS as 'articles'. Given our interest in primary studies only - articles reporting original data - we excluded all other document types. We further excluded all articles sub-classified as 'book chapters' or as 'proceedings papers' in order to limit our analysis to primary studies published in peer-reviewed academic journals. This reduced the set to 1,977 articles. All of these were published in the English language, so no further language refinements were necessary.
All available metadata on these 1,977 articles were exported as plain-text 'flat' format files in four batches, which we later merged via Notepad++. Upon manual examination, we discovered examples of papers classified as 'articles' by WoS that were, in fact, reviews. To further filter our results, we searched all available PMIDs in PubMed (1,903 articles had associated PMIDs, ~96% of the set). We then filtered the results to identify all records classified as 'review', 'systematic review', or 'meta-analysis', identifying 75 records [3]. After examining a sample and agreeing with the PubMed classification, these were removed from our dataset, leaving a total of 1,902 articles.
From these data, we constructed two datasets by parsing out relevant reference data with the Sci2 Tool [4]. First, we constructed a 'node-attribute-list' by linking unique reference strings ('Cite Me As' column in the WoS data files) to unique identifiers; we then parsed into this dataset information on the identity of each paper, including the title of the article, all authors, journal of publication, year of publication, total citations as recorded by WoS, and WoS accession number. Second, we constructed an 'edge-list' that records each citing paper in the 'Source' column and the cited paper in the 'Target' column, using the unique identifiers described previously to link these data to the node-attribute-list.
We then constructed a network in which papers are nodes and citation links between papers are directed edges. We used Gephi version 0.9.2 [5] to manually clean these data by merging duplicate references caused by different reference formats or by referencing errors. To do this, we retained all retrieved records (1,902) as well as all of their references, whether or not those references were included in our original search. In total, this produced a network of 46,633 nodes (unique reference strings) and 112,520 edges (citation links). Thus, the average reference list size of these articles is ~59 references. The mean indegree (within-network citations) is 2.4 (median 1) for the entire network, reflecting a great diversity of referencing choices among our 1,902 articles.
After merging duplicates, we restricted the network to include only the fully retrieved articles (1,902), and retained only those connected together by citation links in a large interconnected network (i.e. the largest component). In total, 1,892 (99.5%) of our initial set were connected via citation links, meaning ten papers were removed from the following analysis; these were neither connected to the largest component nor to one another (i.e. they were 'isolates').
This left us with a network of 1,892 nodes connected together by 26,019 edges. It is this network that is described by the ‘node-attribute-list’ and ‘edge-list’ provided here. This network has a mean in-degree of 13.76 (median in-degree of 4). By restricting our analysis in this way, we lose 44,741 unique references (96%) and 86,501 citations (77%) from the full network, but retain a set of articles tightly knitted together, all of which have been fully retrieved due to possessing certain terms related to oxytocin AND social behaviour in their title, abstract, or associated keywords.
Before moving on, we calculated indegree for all nodes in this network – this counts the number of citations to a given paper from other papers within this network – and have included this in the node-attribute-list. We further clustered this network via modularity maximisation via the Leiden algorithm [6]. We set the algorithm to resolution 1, and allowed the algorithm to run over 100 iterations and 100 restarts. This gave Q=0.43 and identified seven clusters, which we describe in detail within the body of the paper. We have included cluster membership as an attribute in the node-attribute-list.
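The in-degree calculation described above simply counts, for each paper, the edges pointing at it. A minimal sketch using Python's standard library (illustrative only, not the Sci2/Gephi implementation):

```python
from collections import Counter

def indegree(edges):
    """Count within-network citations: one per edge pointing at the Target node."""
    return Counter(target for _source, target in edges)

# Example: three papers, where papers 1 and 2 both cite paper 3
edges = [(1, 3), (2, 3), (1, 2)]
deg = indegree(edges)  # paper 3 has indegree 2, paper 2 has indegree 1
```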
Data description
We include here two datasets: (i) ‘OTSOC-node-attribute-list.csv’ consists of the attributes of 1,892 primary articles retrieved from WoS that include terms indicating a focus on oxytocin and social behaviour; (ii) ‘OTSOC-edge-list.csv’ records the citations between these papers. Together, these can be imported into a range of different software for network analysis; however, we have formatted these for ease of upload into Gephi 0.9.2. Below, we detail their contents:
OTSOC-node-attribute-list.csv:
Id, the unique identifier
Label, the reference string of the paper to which the attributes in this row correspond. This is taken from the ‘Cite Me As’ column from the original WoS download. The reference string is in the following format: last name of first author, publication year, journal, volume, start page, and DOI (if available).
Wos_id, unique Web of Science (WoS) accession number. These can be used to query WoS to find further data on all papers via the ‘UT= ’ field tag.
Title, paper title.
Authors, all named authors.
Journal, journal of publication.
Pub_year, year of publication.
Wos_citations, total number of citations recorded by WoS Core Collection to a given paper, as of 13 September 2021.
Indegree, the number of within network citations to a given paper, calculated for the network shown in Figure 1 of the manuscript.
Cluster, provides the cluster membership number as discussed within the manuscript (Figure 1). This was established via modularity maximisation using the Leiden algorithm (resolution 1; Q=0.43; 7 clusters).
OTSOC-edge-list.csv:
Source, the unique identifier of the citing paper.
Target, the unique identifier of the cited paper.
Type, edges are ‘Directed’, and this column tells Gephi to regard all edges as such.
Syr_date, this contains the date of publication of the citing paper.
Tyr_date, this contains the date of publication of the cited paper.
Software recommended for analysis
Gephi version 0.9.2 was used for the visualisations within the manuscript, and both files can be read into Gephi without modification.
Notes
[1] Leng, G., Leng, R. I., Ludwig, M. (Submitted). Oxytocin – a social peptide? Deconstructing the evidence.
[2] Edinburgh University’s subscription to Web of Science covers the following databases: (i) Science Citation Index Expanded, 1900-present; (ii) Social Sciences Citation Index, 1900-present; (iii) Arts & Humanities Citation Index, 1975-present; (iv) Conference Proceedings Citation Index- Science, 1990-present; (v) Conference Proceedings Citation Index- Social Science & Humanities, 1990-present; (vi) Book Citation Index– Science, 2005-present; (vii) Book Citation Index– Social Sciences & Humanities, 2005-present; (viii) Emerging Sources Citation Index, 2015-present.
[3] For those interested, the following PMIDs were identified as ‘articles’ by WoS, but as ‘reviews’ by PubMed: ‘34502097’ ‘33400920’ ‘32060678’ ‘31925983’ ‘31734142’ ‘30496762’ ‘30253045’ ‘29660735’ ‘29518698’ ‘29065361’ ‘29048602’ ‘28867943’ ‘28586471’ ‘28301323’ ‘27974283’ ‘27626613’ ‘27603523’ ‘27603327’ ‘27513442’ ‘27273834’ ‘27071789’ ‘26940141’ ‘26932552’ ‘26895254’ ‘26869847’ ‘26788924’ ‘26581735’ ‘26548910’ ‘26317636’ ‘26121678’ ‘26094200’ ‘25997760’ ‘25631363’ ‘25526824’ ‘25446893’ ‘25153535’ ‘25092245’ ‘25086828’ ‘24946432’ ‘24637261’ ‘24588761’ ‘24508579’ ‘24486356’ ‘24462936’ ‘24239932’ ‘24239931’ ‘24231551’ ‘24216134’ ‘23955310’ ‘23856187’ ‘23686025’ ‘23589638’ ‘23575742’ ‘23469841’ ‘23055480’ ‘22981649’ ‘22406388’ ‘22373652’ ‘22141469’ ‘21960250’ ‘21881219’ ‘21802859’ ‘21714746’ ‘21618004’ ‘21150165’ ‘20435805’ ‘20173685’ ‘19840865’ ‘19546570’ ‘19309413’ ‘15288368’ ‘12359512’ ‘9401603’ ‘9213136’ ‘7630585’
[4] Sci2 Team. (2009). Science of Science (Sci2) Tool. Indiana University and SciTech Strategies. Stable URL: https://sci2.cns.iu.edu
[5] Bastian, M., Heymann, S., & Jacomy, M. (2009). Gephi: An open source software for exploring and manipulating networks. Proceedings of the Third International AAAI Conference on Weblogs and Social Media (ICWSM).
[6] Traag, V. A., Waltman, L., & van Eck, N. J. (2019). From Louvain to Leiden: Guaranteeing well-connected communities. Scientific Reports, 9, 5233.
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This dataset contains data source collection (e.g. COCI, DOCI, POCI) information about all the citation data (in N-Triples format) included in the OpenCitations Index, released on July 10, 2025. In particular, each citation in the dataset, defined as an individual of the class cito:Citation, includes the following information:

[property "prov:atLocation"] the data source entity, identified by its URL (https://w3id.org/oc/index/[DATA-SOURCE]/).

This version of the dataset contains 2,693,728,426 citations. The size of the zipped archive is 25.7 GB, while the size of the unzipped N-Triples files is 426 GB.
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This dataset contains data source collection (e.g. COCI, DOCI, POCI) information about all the citation data (in CSV format) included in the OpenCitations Index, released on July 10, 2025. In particular, each citation in the dataset, identified by its OCI (first column), has a corresponding value that defines its source (second column), e.g. "coci", "doci", "poci". This version of the dataset contains 2,693,728,426 citations. The size of the zipped archive is 23 GB, while the size of the unzipped CSV files is 104 GB.
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This dataset contains the provenance information (in CSV format) of all the citation data included in the OpenCitations Index, released on July 13, 2025. In particular, each line of the CSV file defines a citation and includes the following information:

[field "oci"] the Open Citation Identifier (OCI) for the citation;
[field "snapshot"] the identifier of the snapshot;
[field "agent"] the name of the agent that created the citation data;
[field "source"] the URL of the source dataset from which the citation data were extracted;
[field "created"] the creation time of the citation data;
[field "invalidated"] the start of the destruction, cessation, or expiry of an existing entity by an activity;
[field "description"] a textual description of the activity;
[field "update"] the SPARQL UPDATE query that keeps track of which metadata have been modified.

The size of the zipped archive is 20 GB, while the size of the unzipped CSV files is 454 GB.
This dataset originates from the Data Citation Corpus V4.1: https://zenodo.org/records/16901115
To recreate this dataset, first download the csv format files from Corpus V4.1: https://zenodo.org/records/16901115
Then run this:
import glob
import pandas as pd
# Read all CSV files from the folder
# Make sure to have the correct folder where your csv files have been unzipped
csv_files = glob.glob('2025-08-15-data-citation-corpus-v4.1/*.csv')
# Read and combine all CSV files
dataframes = []
for file in csv_files:
    df = pd.read_csv(file)
    dataframes.append(df)
df_mdc_combined = pd.concat(dataframes, ignore_index=True).drop_duplicates()
# To save space, drop unnecessary columns
df_mdc_combined = df_mdc_combined.drop(columns=['id', 'subjects', 'affiliations', 'affiliationsROR', 'funders', 'fundersROR'])
# Keep only rows where source is 'datacite' or 'eupmc', we don't need others
df_mdc_combined = df_mdc_combined[df_mdc_combined.source.isin(['datacite','eupmc'])].copy()
# Remove https://doi.org/ from publication
# Replace / with _ in publication
df_mdc_combined['publication'] = df_mdc_combined['publication'].str.replace('https://doi.org/', '', regex=False)
df_mdc_combined['publication'] = df_mdc_combined['publication'].str.replace('/', '_', regex=False)
# Note: writing Parquet with pandas requires the pyarrow or fastparquet package
df_mdc_combined.to_parquet('data_citation_corpus_filtered_v4.1.parquet', index=False)
References: DataCite, & Make Data Count. (2025). Data Citation Corpus Data File (v4.1) [Data set]. DataCite. https://doi.org/10.5281/zenodo.16901115
Measuring the usage of informatics resources such as software tools and databases is essential to quantifying their impact, value and return on investment. We have developed a publicly available dataset of informatics resource publications and their citation network, along with an associated metric (u-Index) to measure informatics resources’ impact over time. Our dataset differentiates the context in which citations occur to distinguish between ‘awareness’ and ‘usage’, and uses a citing universe of open access publications to derive citation counts for quantifying impact. Resources with a high ratio of usage citations to awareness citations are likely to be widely used by others and have a high u-Index score. We have pre-calculated the u-Index for nearly 100,000 informatics resources. We demonstrate how the u-Index can be used to track informatics resource impact over time. The method of calculating the u-Index metric, the pre-computed u-Index values, and the dataset we compiled to calc...
Big Data and Society Abstract & Indexing - ResearchHelpDesk - Big Data & Society (BD&S) is an open access, peer-reviewed scholarly journal that publishes interdisciplinary work, principally in the social sciences, humanities, and computing and their intersections with the arts and natural sciences, about the implications of Big Data for societies. The Journal's key purpose is to provide a space for connecting debates about the emerging field of Big Data practices and how they are reconfiguring academic, social, industry, business, and government relations, expertise, methods, concepts, and knowledge. BD&S moves beyond usual notions of Big Data and treats it as an emerging field of practice that is not defined by but generative of (sometimes) novel data qualities such as high volume and granularity and complex analytics such as data linking and mining. It thus attends to digital content generated through online and offline practices in social, commercial, scientific, and government domains. This includes, for instance, the content generated on the Internet through social media and search engines but also that which is generated in closed networks (commercial or government transactions) and open networks such as digital archives, open government, and crowdsourced data. Critically, rather than settling on a definition, the Journal makes this an object of interdisciplinary inquiries and debates explored through studies of a variety of topics and themes. BD&S seeks contributions that analyze Big Data practices and/or involve empirical engagements and experiments with innovative methods, while also reflecting on the consequences for how societies are represented (epistemologies), realized (ontologies), and governed (politics). The article processing charge (APC) for this journal is currently 1500 USD.
Authors who do not have funding for open access publishing can request a waiver from the publisher, SAGE, once their Original Research Article is accepted after peer review. For all other content (Commentaries, Editorials, Demos) and Original Research Articles commissioned by the Editor, the APC will be waived. Abstract & Indexing: Clarivate Analytics Social Sciences Citation Index (SSCI); Directory of Open Access Journals (DOAJ); Google Scholar; Scopus.
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Data sharing is key for replication and re-use in empirical research. Scientific journals can play a central role by establishing data policies and providing technologies. In this study, factors influencing data sharing are analyzed by investigating journal data policies and author behavior in sociology. The websites of 140 sociology journals were consulted to check their data policies. The results are compared with similar studies from political science and economics. For five selected journals with a broad variety, all articles from two years are examined to see whether authors really cite and share their data, and which factors are related to this. Methods: content analysis; web-based observation. Sample: journals of the 2013 Social Science Citation Index (full selection of journals in the category "sociology"); all articles from 5 selected journals in 2012 and 2013.
The entire dataset is obtained from the public and open-access data of ScimagoJR (SCImago Journal & Country Rank).
Documents: Number of documents published during the selected year. It is usually called the country's scientific output.
Citable Documents: Number of citable documents published during the selected year; only articles, reviews, and conference papers are considered.
Citations: Number of citations received by the documents published during the source year, i.e. citations in years X, X+1, X+2, X+3... to documents published during year X. When referring to the period 1996-2021, all documents published during this period are considered.
Citations per Document: Average citations per document published during the source year, i.e. citations in years X, X+1, X+2, X+3... to documents published during year X. When referring to the period 1996-2021, all documents published during this period are considered.
Self Citations: Country self-citations. Number of self-citations of all dates received by the documents published during the source year, i.e. self-citations in years X, X+1, X+2, X+3... to documents published during year X. When referring to the period 1996-2021, all documents published during this period are considered.
H index: A country's h-index is the number of its articles (h) that have received at least h citations each. It quantifies both a country's scientific productivity and scientific impact, and it is also applicable to scientists, journals, etc.
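The h-index definition above can be expressed as a short sketch (an illustrative Python implementation, not SCImago's own code):

```python
def h_index(citations):
    """Largest h such that at least h documents have >= h citations each."""
    counts = sorted(citations, reverse=True)
    h = 0
    for i, c in enumerate(counts, start=1):
        if c >= i:
            h = i          # the i-th most-cited document still has >= i citations
        else:
            break
    return h

# A country whose papers are cited [10, 8, 5, 4, 3] times has h = 4:
# four papers each have at least 4 citations, but not five with >= 5.
```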
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
arXiv publications dataset with simulated citation relationships
https://github.com/jacekmiecznikowski/neo4index

The neo4index app evaluates scientific research impact using author-level metrics (h-index and more). This collection contains data acquired from arXiv.org via the OAI2 protocol. arXiv does not provide citation metadata, so this data was pseudo-randomly simulated. We evaluated scientific research impact using six popular author-level metrics: h-index, m quotient, e-index, m-index, r-index, and ar-index.

Source: https://arxiv.org/help/bulk_data (downloaded: 2018-03-23; over 1.3 million publications)

Files:
* arxiv_bulk_metadata_2018-03-23.tar.gz - downloaded using oai-harvester; contains metadata of all arXiv publications to date.
* categories.csv - contains categories from arXiv with the category-subcategory division.
* publications.csv - contains information about articles: id, title, abstract, url, categories, and date.
* authors.csv - contains author data: first name, last name, and id of the published article.
* citations.csv - contains simulated relationships between all publications, generated using arxivCite.
* indices.csv - contains 6 author-level metrics calculated on the database using neo4index.

Statistics:
h-index: average = 3.5836524733724495, median = 1.0, mode = 1.0
m quotient: average = 0.5831426366846965, median = 0.4167, mode = 1.0
e-index: average = 7.9260187734579075, median = 5.3852, mode = 0.0
m-index: average = 29.436844659143155, median = 17.0, mode = 0.0
r-index: average = 8.931101630575293, median = 5.831, mode = 0.0
ar-index: average = 3.5439082808721025, median = 2.7928, mode = 0.0
https://data.go.kr/ugs/selectPortalPolicyView.do
The KCI Journal Information data provides key information on domestic academic journals registered in the Korea Citation Index (KCI) system. It includes detailed information such as electronic and paper International Standard Serial Numbers (ISSNs), journal titles, indexing categories, research areas, year of publication, publication cycle, language, issuing institutions, affiliated research institutes, affiliated universities, and institutional classification. This data can be used for a variety of research and practical purposes, including assessing the current status of academic journals, comparing journals by field of study, analyzing the academic activities of researchers and institutions, and selecting journals. Updated annually, the most recent information is available based on the revision date.