License: CC0 1.0 Universal Public Domain Dedication, https://creativecommons.org/publicdomain/zero/1.0/
An academic journal or research journal is a periodical publication in which research articles relating to a particular academic discipline are published, according to Wikipedia. Currently, more than 25,000 peer-reviewed journals are indexed in citation databases such as Scopus and Web of Science. These indexes rank journals on the basis of various metrics, such as CiteScore and h-index, which are calculated from the journal's yearly citation data. Considerable effort goes into designing metrics that reflect a journal's quality.
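As a quick illustration of how such metrics derive from raw citation data, here is a minimal Python sketch (illustrative only, not part of this dataset) computing an h-index from per-paper citation counts:

# Minimal sketch: the h-index is the largest h such that
# h papers each have at least h citations.
def h_index(citations):
    ranked = sorted(citations, reverse=True)
    return sum(1 for rank, c in enumerate(ranked, start=1) if c >= rank)

print(h_index([10, 8, 5, 4, 3]))  # prints 4: four papers have >= 4 citations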
This is a comprehensive dataset on academic journals, covering their metadata as well as citation, metric, and ranking information. Detailed data on each journal's subject areas is also given. The dataset is collected from the following indexing databases:
- Scimago Journal Ranking
- Scopus
- Web of Science Master Journal List
The data was collected by scraping and then cleaned; details of the process can be found HERE.
The rest of the features provide further details on each journal's subject area or category:
- Life Sciences: top-level subject area.
- Social Sciences: top-level subject area.
- Physical Sciences: top-level subject area.
- Health Sciences: top-level subject area.
- 1000 General: ASJC main category.
- 1100 Agricultural and Biological Sciences: ASJC main category.
- 1200 Arts and Humanities: ASJC main category.
- 1300 Biochemistry, Genetics and Molecular Biology: ASJC main category.
- 1400 Business, Management and Accounting: ASJC main category.
- 1500 Chemical Engineering: ASJC main category.
- 1600 Chemistry: ASJC main category.
- 1700 Computer Science: ASJC main category.
- 1800 Decision Sciences: ASJC main category.
- 1900 Earth and Planetary Sciences: ASJC main category.
- 2000 Economics, Econometrics and Finance: ASJC main category.
- 2100 Energy: ASJC main category.
- 2200 Engineering: ASJC main category.
- 2300 Environmental Science: ASJC main category.
- 2400 Immunology and Microbiology: ASJC main category.
- 2500 Materials Science: ASJC main category.
- 2600 Mathematics: ASJC main category.
- 2700 Medicine: ASJC main category.
- 2800 Neuroscience: ASJC main category.
- 2900 Nursing: ASJC main category.
- 3000 Pharmacology, Toxicology and Pharmaceutics: ASJC main category.
- 3100 Physics and Astronomy: ASJC main category.
- 3200 Psychology: ASJC main category.
- 3300 Social Sciences: ASJC main category.
- 3400 Veterinary: ASJC main category.
- 3500 Dentistry: ASJC main category.
- 3600 Health Professions: ASJC main category.
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Last Version: 4
Authors: Carlota Balsa-Sánchez, Vanesa Loureiro
Date of data collection: 2022/12/15
General description: The publication of datasets according to the FAIR principles can be achieved by publishing a data paper (or software paper) in data journals or in standard academic journals. The Excel and CSV files contain a list of academic journals that publish data papers and software papers.
File list:
- data_articles_journal_list_v4.xlsx: full list of 140 academic journals in which data papers and/or software papers can be published
- data_articles_journal_list_v4.csv: full list of 140 academic journals in which data papers and/or software papers can be published
Relationship between files: both files have the same information. Two different formats are offered to improve reuse
Type of version of the dataset: final processed version
Versions of the files: 4th version
- Information updated: number of journals, URLs, document types associated with a specific journal, publisher normalization and simplification of document types
- Information added: whether the journal is listed in the Directory of Open Access Journals (DOAJ), indexed in Scopus and the Web of Science (WOS) Journal Master List, and its quartile in Journal Citation Reports (JCR) and/or Scimago Journal and Country Rank (SJR).
Version: 3
Authors: Carlota Balsa-Sánchez, Vanesa Loureiro
Date of data collection: 2022/10/28
General description: The publication of datasets according to the FAIR principles can be achieved by publishing a data paper (or software paper) in data journals or in standard academic journals. The Excel and CSV files contain a list of academic journals that publish data papers and software papers.
File list:
- data_articles_journal_list_v3.xlsx: full list of 124 academic journals in which data papers and/or software papers can be published
- data_articles_journal_list_3.csv: full list of 124 academic journals in which data papers and/or software papers can be published
Relationship between files: both files have the same information. Two different formats are offered to improve reuse
Type of version of the dataset: final processed version
Versions of the files: 3rd version
- Information updated: number of journals, URLs, document types associated with a specific journal, publisher normalization and simplification of document types
- Information added: whether the journal is listed in the Directory of Open Access Journals (DOAJ), indexed in Web of Science (WOS), and its quartile in Journal Citation Reports (JCR) and/or Scimago Journal and Country Rank (SJR).
Erratum - Data articles in journals Version 3:
Botanical Studies -- ISSN 1999-3110 -- JCR (JIF) Q2
Data -- ISSN 2306-5729 -- JCR (JIF) n/a
Data in Brief -- ISSN 2352-3409 -- JCR (JIF) n/a
Version: 2
Author: Francisco Rubio, Universitat Politècnica de València.
Date of data collection: 2020/06/23
General description: The publication of datasets according to the FAIR principles can be achieved by publishing a data paper (or software paper) in data journals or in standard academic journals. The Excel and CSV files contain a list of academic journals that publish data papers and software papers.
File list:
- data_articles_journal_list_v2.xlsx: full list of 56 academic journals in which data papers and/or software papers can be published
- data_articles_journal_list_v2.csv: full list of 56 academic journals in which data papers and/or software papers can be published
Relationship between files: both files have the same information. Two different formats are offered to improve reuse
Type of version of the dataset: final processed version
Versions of the files: 2nd version
- Information updated: number of journals, URLs, document types associated with a specific journal, publisher normalization and simplification of document types
- Information added: whether the journal is listed in the Directory of Open Access Journals (DOAJ), indexed in Web of Science (WOS), and its quartile in Scimago Journal and Country Rank (SJR)
Total size: 32 KB
Version 1: Description
This dataset contains a list of journals that publish data articles, code, software articles and database articles.
The search strategy in DOAJ and Ulrichsweb was to search for the word "data" in journal titles.
Acknowledgements:
Xaquín Lores Torres for his invaluable help in preparing this dataset.
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Introduction: These datasets contain information about journals in the eight regions of the world based on the United Nations SDG classification (Central & Southern Asia, Europe, Eastern & South-Eastern Asia, Latin America, North Africa & Western Asia, Oceania, North America and Sub-Saharan Africa) that are indexed in Web of Science/Scopus and are available in the Ulrich's periodicals directory. The datasets were created by matching Ulrich journal information with journal information from Web of Science and Scopus.
Data Creation: A single Web of Science master journal list was created for SSCI, SCI, AHCI and ESCI by combining their lists and removing duplicate records; the resulting master journal list contained 21,908 unique journals. Only active scholarly journals from Scopus were included in this study; i.e. duplicates, all inactive sources, trade journals, book series, monographs and conference proceedings were removed. 26,029 active journals of the 43,013 sources in Scopus were included. Journal lists from 239 countries were collected from Ulrich's comprehensive periodicals directory and analyzed by region. After removal of duplicates, this generated a database of 83,429 unique active academic journals. To compile regional and global datasets, duplicate journals at the regional and global levels, respectively, were removed. The master journal lists created from Web of Science, Scopus and Ulrich were transferred to an SQL database for querying. Journal matching was carried out in two steps. Firstly, the ISSN numbers of journals in Web of Science and Scopus were used to match journal records to Ulrich. In the second step, the remaining journals were matched using their titles, and these matches were manually verified to reduce the chances of false positives. Using these two steps, we were able to match 20,255 (92.46%) of the journals in Web of Science, and 23,349 (89.70%) of the academic journals from Scopus, with the Ulrichsweb journal list.
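The two-step matching described above can be sketched as follows; this is a minimal pandas reconstruction under assumed column names (issn, title), not the authors' actual SQL:

import pandas as pd

def normalize_title(t: str) -> str:
    # Lowercase and strip non-alphanumerics so title matching is less brittle.
    return "".join(ch for ch in t.lower() if ch.isalnum())

def match_journals(source: pd.DataFrame, ulrich: pd.DataFrame) -> pd.DataFrame:
    # Step 1: exact match on ISSN.
    by_issn = source.merge(ulrich, on="issn", how="inner", suffixes=("", "_ulrich"))
    # Step 2: match the remainder on normalized title; in the study these
    # matches were manually verified to reduce false positives.
    rest = source[~source["issn"].isin(by_issn["issn"])].copy()
    rest["title_key"] = rest["title"].map(normalize_title)
    u = ulrich.assign(title_key=ulrich["title"].map(normalize_title))
    by_title = rest.merge(u, on="title_key", how="inner", suffixes=("", "_ulrich"))
    return pd.concat([by_issn, by_title], ignore_index=True)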
License: CC0 1.0 Universal Public Domain Dedication, https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This is a list of all the journal abbreviations from Web of Science. It is not a perfect list, not least because of the numerous errors in the Web of Science list. However, it was quite a fast way of getting most of the nearly 90,000 journal titles and abbreviations into JabRef, and could be useful for other bibliographic systems and/or doing it manually. This was created using R (the only "programming language" I know), extracting the abbreviations from the Web of Science lists (https://images.webofknowledge.com/WOKRS520B4.1/help/WOS/A_abrvjt.html). Feel free to help with improvements!
Files:
- wos_abbrev_table.csv: table with full names and abbreviations, with and without dots in abbreviations.
- jabref_wos_abbrev.txt: abbreviation table in JabRef format.
- jabref_wos_abbrev_dots.txt: abbreviation table in JabRef format, with dots.
- wos_abbrev_code.R: R code used to create the list.
Thanks to Daniel Graeber (dgr@bios.au.dk) for inspiration and guidance regarding the addition of dots to abbreviated journal names.
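For context, a minimal sketch of how such a CSV could be turned into a JabRef-style abbreviation list (one "Full Journal Name = Abbrev. Name" pair per line); the column names full_name and abbrev are assumptions, not the actual headers of wos_abbrev_table.csv:

import csv

with open("wos_abbrev_table.csv", newline="", encoding="utf-8") as src, \
     open("jabref_abbrev_sketch.txt", "w", encoding="utf-8") as dst:
    for row in csv.DictReader(src):
        # One "full name = abbreviation" pair per line, as JabRef expects.
        dst.write(f"{row['full_name']} = {row['abbrev']}\n")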
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Data on the share of English-language documents in the previous two years and the value of the Impact Factor in the third year, from 2006 to 2016. The data was prepared for the article: Non-English language publications in Citation Indexes – quantity and quality / Olga Moskaleva and Mark Akoev // 17th International Conference on Scientometrics and Informetrics ISSI 2019 and 24th International Conference on Science, Technology and Innovation STI 2019.
License: Attribution-NonCommercial 3.0 (CC BY-NC 3.0), https://creativecommons.org/licenses/by-nc/3.0/
License information was derived automatically
Citation metrics are widely used and misused. We have created a publicly available database of top-cited scientists that provides standardized information on citations, h-index, co-authorship-adjusted hm-index, citations to papers in different authorship positions, and a composite indicator (c-score). Data are shown separately for career-long impact and for single recent year impact. Metrics with and without self-citations and the ratio of citations to citing papers are given, and data on retracted papers (based on the Retraction Watch database) as well as citations to/from retracted papers have been added. Scientists are classified into 22 scientific fields and 174 sub-fields according to the standard Science-Metrix classification. Field- and subfield-specific percentiles are also provided for all scientists with at least 5 papers. Career-long data are updated to end-of-2024 and single recent year data pertain to citations received during calendar year 2024. The selection is based on the top 100,000 scientists by c-score (with and without self-citations) or a percentile rank of 2% or above in the sub-field. This version (7) is based on the August 1, 2025 snapshot from Scopus, updated to the end of citation year 2024. This work uses Scopus data; calculations were performed using all Scopus author profiles as of August 1, 2025. If an author is not on the list, it is simply because the composite indicator value was not high enough to appear on the list; it does not mean that the author does not do good work. PLEASE ALSO NOTE THAT THE DATABASE HAS BEEN PUBLISHED IN AN ARCHIVAL FORM AND WILL NOT BE CHANGED. The published version reflects Scopus author profiles at the time of calculation; we thus advise authors to ensure that their Scopus profiles are accurate. REQUESTS FOR CORRECTIONS OF THE SCOPUS DATA (INCLUDING CORRECTIONS IN AFFILIATIONS) SHOULD NOT BE SENT TO US. They should be sent directly to Scopus, preferably by use of the Scopus to ORCID feedback wizard (https://orcid.scopusfeedback.com/), so that the correct data can be used in any future annual updates of the citation indicator databases. The c-score focuses on impact (citations) rather than productivity (number of publications), and it also incorporates information on co-authorship and author positions (single, first, last author). If you have additional questions, see the attached file on FREQUENTLY ASKED QUESTIONS. Finally, we alert users that all citation metrics have limitations and their use should be tempered and judicious. For more reading, we refer to the Leiden manifesto: https://www.nature.com/articles/520429a
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
A summary of studies associated with research supporting better sport running. The Web of Science was used to populate this list of studies, identified using the key concepts of running performance and meta-analysis or systematic review.
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Contains the list of records obtained from the search.
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Data enclosed in a single zipped folder:
A) DASH-V2: Data files for the final published analysis (J. Informetrics, 2019)
File A1: PubData_DOI_141986_Nc_0_2019.dta
File A2: PubData_DOI_141986_Nc_0_2019_DOFILE
B) DASH-V1: Data files for the preprint version (https://ssrn.com/abstract=2901272)
File B1: PubData_Obs_102741_Nc_10_No2015_CitationsAnalysis.dta
File B2: PubData_Obs_128734_Nc_10_AcceptanceTimeAnalysis.dta
File B3: STATA13_DOFILE
C) Data description common to all .dta files, which contain parsed and merged PLOS ONE and Web of Science metadata:
File A3: UC-DASH_DataDescription_Petersen_V2.pdf
File B4: UC-DASH_DataDescription_Petersen_V1.pdf
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Research data to accompany the article "Overlay journals: a study of the current landscape" (https://doi.org/10.1177/09610006221125208)
Identifying the sample of overlay journals was an explorative process (occurring between April 2021 and February 2022). The sample of investigated overlay journals was identified using the websites of Episciences.org (2021), Scholastica (2021), Free Journal Network (2021), Open Journals (2021), PubPub (2022), and Wikipedia (2021). In total, this study identified 34 overlay journals. Please see the paper for more details about the excluded journal types.
The journal ISSN numbers, manuscript source repositories, first overlay volumes, article volumes, publication languages, peer-review type, licence for published articles, author costs, publisher types, submission policy, and preprint availability policy were observed by inspecting journal editorial policies and submission guidelines found from journal websites. The overlay journals’ ISSN numbers were identified by examining journal websites and cross-checking this information with the Ulrich’s periodicals database (Ulrichsweb, 2021). Journals that published review reports, either with reviewers’ names or anonymously, were classified as operating with open peer-review. Publisher types defined by Laakso and Björk (2013) were used to categorise the findings concerning the publishers. If the journal website did not include publisher information, the editorial board was interpreted to publish the journal.
The Organisation for Economic Co-operation and Development (OECD) field of science classification was used to categorise the journals into different domains of science. The journals’ primary OECD field of sciences were defined by the authors through examining the journal websites.
Whether the journals were indexed in the Directory of Open Access Journals (DOAJ), Scopus, or Clarivate Analytics’ Web of Science Core collection’s journal master list was examined by searching the services with journal ISSN numbers and journal titles.
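For the DOAJ part of this check, a minimal sketch using DOAJ's public journal-search API; the endpoint form and the "total" response field are assumptions based on the public API documentation:

import requests

def in_doaj(issn: str) -> bool:
    # DOAJ's search API accepts fielded queries such as issn:1234-5678.
    resp = requests.get(f"https://doaj.org/api/search/journals/issn:{issn}", timeout=30)
    resp.raise_for_status()
    return resp.json().get("total", 0) > 0

print(in_doaj("2052-4463"))  # hypothetical example ISSN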
The identified overlay journals were examined from the viewpoint of both qualitative and quantitative journal metrics. The qualitative metrics comprised the Nordic expert panel rankings of scientific journals, namely the Finnish Publication Forum, the Danish Bibliometric Research Indicator and the Norwegian Register for Scientific Journals, Series and Publishers. Searches were conducted from the web portals of the above services with both ISSN numbers and journal titles. Clarivate Analytics’ Journal Citation Reports database was searched with the use of both ISSN numbers and journal titles to identify whether the journals had a Journal Citation Indicator (JCI), Two-Year Impact Factor (IF) and an Impact Factor ranking (IF rank). The examined Journal Impact Factors and Impact Factor rankings were for the year 2020 (as released in 2021).
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Context
The following shortlist of diamond open-access journals was compiled to increase awareness of alternative scholarly publication models among the six departments of the Faculty of Science at Utrecht University. The list is relevant to the six disciplines at the Faculty of Science: Biology, Chemistry, Mathematics, Information and Computing Sciences, Physics, and Pharmaceutical Sciences. For this purpose, a "diamond journal" is defined as a journal indexed in the Directory of Open Access Journals (DOAJ) that does not charge an article processing charge (APC).
Contents and Results
The Excel file titled “Diamond_journals_faculty_of_science_UU” contains the list of selected diamond journals based on the following criteria: they allow submissions in English, have a plagiarism screening policy, possess an electronic ISSN number, and accept submissions in Biology, Chemistry, Mathematics, Information and Computing Sciences, Physics, and Pharmaceutical Sciences. In this shortlist, 355 journals meet the criteria. Out of these 355 journals, only 29 have received a DOAJ seal, 150 journals are indexed in Scopus, and 94 journals are indexed in Web of Science.
A detailed description of the methods employed to obtain this shortlist can be found in the Word file titled "Methods_and_Results".
The raw CSV data has been included under the name "Raw_DOAJ_journal_metadata_2023_07_25".
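A minimal sketch of this kind of filtering over the raw DOAJ dump; the simplified column names below (apc, languages, plagiarism_screening, eissn) are assumptions, as the real DOAJ CSV uses longer verbose headers:

import pandas as pd

df = pd.read_csv("Raw_DOAJ_journal_metadata_2023_07_25.csv")

mask = (
    df["apc"].eq("No")                                   # no article processing charge
    & df["languages"].str.contains("English", na=False)  # accepts English submissions
    & df["plagiarism_screening"].eq("Yes")               # plagiarism screening policy
    & df["eissn"].notna()                                # electronic ISSN present
)
print(len(df[mask]), "candidate diamond journals")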
Limitations
The compilers of this shortlist are aware that some current diamond journals could change their status to non-diamond by charging article processing fees at a later stage. Since journal records are not always updated by publishers, we strongly recommend that users double-check the latest open access status directly on the journal's homepage (journal URLs are provided in the Excel file). The same applies to the Scopus and WOS indexations.
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Raw list of articles retrieved from searches of three databases (PubMed, WOS, CAB Abstracts) using keywords for rice pests and diseases, prior to duplicate removal and data curation. The Excel file is divided into 3 worksheets, one for each of the 3 databases.
License: Attribution-ShareAlike 4.0 (CC BY-SA 4.0), https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
List of the top institutions of the International Journal of Web Science, sorted by citations.
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Clarivate Analytics, manager of the Web of Science, publishes an annual listing of highly cited researchers. The opening sentence of the 2019 report asks, "Who would contest that in the race for knowledge it is human capital that is most essential?". They state that "talent - including intelligence, creativity, ambition, and social competence (where needed) - outpaces other capacities such as access to funding and facilities". This contradicts the findings of Sinay et al. (2019), who found that the algorithm used by search engines, including the Web of Science, is possibly more influential than human capital. Using Clarivate Analytics' database for 2018, we investigated which factors are most relevant in the impact race. Rather than human capital alone, we found that language, gender, funding and facilities introduce bias to assessments and possibly prevent talent and discoveries from emerging. We found that the profile of the highly cited scholars is so narrow that it may compromise the validity of scientific knowledge, because it is biased towards the perception and interests of male scholars affiliated with very-highly-developed countries where English is commonly spoken. These scholars accounted for 80 percent of the random sample analyzed; absent were women from Latin America, Africa, Asia and Oceania, and scholars affiliated with institutions in low-human-development countries. Ninety-eight percent of the published research came from institutions in very-highly-developed countries. Providing evidence that challenges the view that 'talent is the primary driver of scientific advancement' is important because search engines, such as the Web of Science, can modify their algorithms. This would ensure that the work of scholars who do not fit the currently dominant profile can have its importance elevated, so that their findings can more equitably contribute to knowledge development. This, in turn, will increase the validity of scientific enquiry. Data was collected from Clarivate Analytics.
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Journal lists of all 46 Sub-Saharan African countries were retrieved manually from Ulrich's periodicals database using the "country of publication" field in the advanced search interface. Delimiters were used to limit the retrieved results to periodicals in the journal categories and with active status. Ulrich's database usually contains multiple records for the different formats (e.g. online and print) or languages in which a single journal is published. Duplicates were removed from the retrieved results.
Master journal lists for the Web of Science indexes, comprising the Science Citation Index Expanded (SCIE), the Social Sciences Citation Index (SSCI), the Arts and Humanities Citation Index (A&HCI) and the Emerging Sources Citation Index (ESCI), and master journal lists for the Scopus, EMBASE and MEDLINE databases, were downloaded from their respective publishers' websites. The master journal list for AJOL was not available on the publisher's website; therefore, it was created manually by extracting journal information from the publisher's website. Only active journals were included in the study, where active journals were defined as journals that had published at least one issue in 2021 or 2020. The master journal list for AIM was not available either. The whole database, comprising 18,949 articles, was downloaded with the source (journal names). Journals were sorted to identify unique journal names; only 15,279 articles had identifiable journal names. Five hundred twenty-four unique journals were identified, of which only 74 were active. Journals that were not indexed in the AIM database in 2020 or 2021 were deemed inactive and were not included in the study. This study was not considered for ethics review because the data used was collected from publicly available records.
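A minimal sketch of the format-level deduplication described above, under assumed title/issn column names (Ulrich's often lists one record per format of the same journal, so ISSNs differ while titles agree):

import pandas as pd

df = pd.read_csv("ulrich_journals.csv")  # hypothetical export filename
# Key on the normalized title rather than ISSN, since print and online
# editions of the same journal carry different ISSNs.
df["title_key"] = df["title"].str.lower().str.replace(r"[^a-z0-9]", "", regex=True)
unique = df.drop_duplicates(subset="title_key")
print(len(df), "records ->", len(unique), "unique journals")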
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Predatory journals are entities within academia that publish work without high-quality peer review while falsely promising widespread reach, indexing in reputable databases, rapid publication times, and low or no article processing charges, and aggressively soliciting legitimate researchers for submissions. There is currently no freely available database of individual journal titles that are thought to be potentially predatory. This dataset comprises a list of such journals generated from Beall's list of potentially predatory publishers. This dataset of 42,896 journals may be used to guide researchers across all fields, but particularly within biomedical academia, when considering submission of their work for publication and when evaluating unsolicited email contact from journals. The dataset was generated by web scraping the list of publishers from Beall's list of predatory publishers and validated against the National Library of Medicine, Directory of Open Access Journals, Web of Science, Scopus and Embase bibliographic registries.
License: Attribution-ShareAlike 4.0 (CC BY-SA 4.0), https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
List of the top schools of the International Journal of Web Science, sorted by citations.
Dataset Description:
The data is partitioned according to a 75/15/15 train/test/validate split. Each entry has an abstract (which is the input text for classification), a domain (a label from the list below), and an area (a subdomain of the paper, such as CS -> computer graphics, which takes on one of 134 possible values). All the attributes are strings. Domain labels: - Computer Science - Electrical Engineering - Psychology - Mechanical Engineering - Civil Engineering - Medical… See the full description on the dataset page: https://huggingface.co/datasets/river-martin/web-of-science-with-label-texts.
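A minimal sketch of loading this dataset with the Hugging Face datasets library; the field names follow the description above, while the exact split names (e.g. validate vs. validation) should be confirmed on the dataset page:

from datasets import load_dataset

ds = load_dataset("river-martin/web-of-science-with-label-texts")
print(ds)  # shows the available splits and their sizes

example = ds["train"][0]
print(example["abstract"][:200])                 # input text for classification
print(example["domain"], "->", example["area"])  # label and sub-domain label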
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
A comprehensive list of ecology and evolution journals including citations, impact factor, and eigenfactor scores. Data derived from the 2018 Journal Citation Reports bibliometric tools.
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The LSC (Leicester Scientific Corpus)
August 2019, by Neslihan Suzen, PhD student at the University of Leicester (ns433@leicester.ac.uk). Supervised by Prof Alexander Gorban and Dr Evgeny Mirkes.
The data is extracted from the Web of Science [1]. You may not copy or distribute this data in whole or in part without the written consent of Clarivate Analytics.

Getting Started
This text provides background information on the LSC (Leicester Scientific Corpus) and the pre-processing steps applied to abstracts, and describes the structure of the files that organise the corpus. The corpus was created to be used in future work on the quantification of the sense of research texts. One of the goals of publishing the data is to make it available for further analysis and use in Natural Language Processing projects.
LSC is a collection of abstracts of articles and proceedings papers published in 2014 and indexed by the Web of Science (WoS) database [1]. Each document contains the title, list of authors, list of categories, list of research areas, and times cited. The corpus contains only documents in English. The corpus was collected online in July 2018 and contains the number of citations from publication date to July 2018.
Each document in the corpus contains the following parts:
1. Authors: the list of authors of the paper.
2. Title: the title of the paper.
3. Abstract: the abstract of the paper.
4. Categories: one or more categories from the list of categories [2]. The full list of categories is presented in the file 'List_of_Categories.txt'.
5. Research Areas: one or more research areas from the list of research areas [3]. The full list of research areas is presented in the file 'List_of_Research_Areas.txt'.
6. Total Times Cited: the number of times the paper was cited by other items from all databases within the Web of Science platform [4].
7. Times Cited in Core Collection: the total number of times the paper was cited by other papers within the WoS Core Collection [4].
We describe a document as the collection of information (about a paper) listed above. The total number of documents in LSC is 1,673,824. All documents in LSC have a nonempty abstract, title, categories, research areas and times cited in WoS databases. There are 119 documents with an empty authors list; we did not exclude these documents.

Data Processing
This section describes all the steps required for the LSC to be collected, cleaned and made available to researchers. Processing the data consists of six main steps:
Step 1: Downloading the Data Online. This is the step of collecting the dataset online, done manually by exporting documents as tab-delimited files. All downloaded documents are available online.
Step 2: Importing the Dataset to R. This is the process of converting the collection to RData format for processing. The LSC was collected as TXT files; all documents are extracted to R.
Step 3: Cleaning the Data of Documents with an Empty Abstract or without a Category. Not all papers in the collection have an abstract and categories. As our research is based on the analysis of abstracts and categories, inaccurate documents were detected and removed in advance: all documents with empty abstracts and all documents without categories.
Step 4: Identification and Correction of Concatenated Words in Abstracts. Traditionally, abstracts are written as an executive summary in one paragraph of continuous writing, known as an 'unstructured abstract'. However, medicine-related publications in particular use 'structured abstracts', divided into sections with distinct headings such as introduction, aim, objective, method, result, conclusion, etc. The tool used for extracting abstracts concatenates the words of section headings with the first word of the section; as a result, some structured abstracts in the LSC require an additional correction process to split such concatenated words. For instance, we observe words such as ConclusionHigher and ConclusionsRT in the corpus. The detection and identification of concatenated words cannot be totally automated; human intervention is needed to identify the possible section headings. We note that we only consider concatenated words in section headings, as it is not possible to detect all concatenated words without deep knowledge of the research areas. Identification of such words was done by sampling medicine-related publications. The section headings identified in structured abstracts are given in List 1.
List 1. Headings of sections identified in structured abstracts: Background, Method(s), Design, Theoretical, Measurement(s), Location, Aim(s), Methodology, Process, Abstract, Population, Approach, Objective(s), Purpose(s), Subject(s), Introduction, Implication(s), Patient(s), Procedure(s), Hypothesis, Measure(s), Setting(s), Limitation(s), Discussion, Conclusion(s), Result(s), Finding(s), Material(s), Rationale(s), Implications for health and nursing policy.
All words including the headings in List 1 are detected in the entire corpus, and such words are then split in two. For instance, the word 'ConclusionHigher' is split into 'Conclusion' and 'Higher'.
Step 5: Extracting (Sub-setting) the Data Based on Lengths of Abstracts. After the correction of concatenated words is completed, the lengths of abstracts are calculated. 'Length' indicates the total number of words in the text, calculated by the same rule as Microsoft Word's 'word count' [5]. According to the APA style manual [6], an abstract should contain between 150 and 250 words; however, word limits vary from journal to journal. For instance, the Journal of Vascular Surgery recommends that 'Clinical and basic research studies must include a structured abstract of 400 words or less' [7]. In LSC, the length of abstracts varies from 1 to 3805 words. We decided to limit the length of abstracts to between 30 and 500 words in order to study documents with abstracts of typical length and to avoid the effect of length on the analysis. Documents containing fewer than 30 or more than 500 words in their abstracts were removed.
Step 6: Saving the Dataset in CSV Format. Corrected and extracted documents are saved into 36 CSV files, whose structure is described in the following section.

The Structure of Fields in CSV Files
In the CSV files, the information is organised with one record per line; the abstract, title, list of authors, list of categories, list of research areas, and times cited are recorded in separate fields.
To access the LSC for research purposes, please email ns433@le.ac.uk.

References
[1] Web of Science. (15 July). Available: https://apps.webofknowledge.com/
[2] WoS Subject Categories. Available: https://images.webofknowledge.com/WOKRS56B5/help/WOS/hp_subject_category_terms_tasca.html
[3] Research Areas in WoS. Available: https://images.webofknowledge.com/images/help/WOS/hp_research_areas_easca.html
[4] Times Cited in WoS Core Collection. (15 July). Available: https://support.clarivate.com/ScientificandAcademicResearch/s/article/Web-of-Science-Times-Cited-accessibility-and-variation?language=en_US
[5] Word Count. Available: https://support.office.com/en-us/article/show-word-count-3c9e6a11-a04d-43b4-977c-563a0e0d5da3
[6] American Psychological Association, Publication Manual. Washington, DC: American Psychological Association, 1983.
[7] P. Gloviczki and P. F. Lawrence, "Information for authors," Journal of Vascular Surgery, vol. 65, no. 1, pp. A16-A22, 2017.
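A minimal sketch of Steps 4 and 5; the heading list is abbreviated from List 1 above, and the regex approach is an assumption, since the authors describe a partly manual process:

import re

# A few of the section headings from List 1; longer variants must precede
# their prefixes (e.g. "Conclusions" before "Conclusion").
HEADINGS = ["Conclusions", "Conclusion", "Backgrounds", "Background",
            "Methods", "Method", "Results", "Result", "Objectives", "Objective"]
pattern = re.compile(r"\b(" + "|".join(HEADINGS) + r")(?=[A-Z])")

def split_concatenated(text: str) -> str:
    # Step 4: split a heading glued to the next word, e.g. "ConclusionHigher".
    return pattern.sub(r"\1 ", text)

def keep_by_length(abstract: str, lo: int = 30, hi: int = 500) -> bool:
    # Step 5: keep only abstracts of typical length (30 to 500 words).
    return lo <= len(abstract.split()) <= hi

print(split_concatenated("ConclusionHigher doses were effective."))
# -> "Conclusion Higher doses were effective."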