Attribution-NonCommercial 3.0 (CC BY-NC 3.0) https://creativecommons.org/licenses/by-nc/3.0/
License information was derived automatically
Citation metrics are widely used and misused. We have created a publicly available database of top-cited scientists that provides standardized information on citations, h-index, co-authorship-adjusted hm-index, citations to papers in different authorship positions and a composite indicator (c-score). Separate data are shown for career-long impact and for single recent year impact. Metrics with and without self-citations and the ratio of citations to citing papers are given. Scientists are classified into 22 scientific fields and 174 sub-fields according to the standard Science-Metrix classification. Field- and subfield-specific percentiles are also provided for all scientists with at least 5 papers. Career-long data are updated to end-of-2022 and single recent year data pertain to citations received during calendar year 2022. The selection is based on the top 100,000 scientists by c-score (with and without self-citations) or a percentile rank of 2% or above in the sub-field. This version (6) is based on the October 1, 2023 snapshot from Scopus, updated to end of citation year 2022. This work uses Scopus data provided by Elsevier through ICSR Lab (https://www.elsevier.com/icsr/icsrlab). Calculations were performed using all Scopus author profiles as of October 1, 2023. If an author is not on the list, it is simply because the composite indicator value was not high enough to appear on the list. It does not mean that the author does not do good work.
PLEASE ALSO NOTE THAT THE DATABASE HAS BEEN PUBLISHED IN AN ARCHIVAL FORM AND WILL NOT BE CHANGED. The published version reflects Scopus author profiles at the time of calculation. We thus advise authors to ensure that their Scopus profiles are accurate. REQUESTS FOR CORRECTIONS OF THE SCOPUS DATA (INCLUDING CORRECTIONS IN AFFILIATIONS) SHOULD NOT BE SENT TO US. They should be sent directly to Scopus, preferably by use of the Scopus to ORCID feedback wizard (https://orcid.scopusfeedback.com/) so that the correct data can be used in any future annual updates of the citation indicator databases.
The c-score focuses on impact (citations) rather than productivity (number of publications) and it also incorporates information on co-authorship and author positions (single, first, last author). If you have additional questions, please read the 3 associated PLoS Biology papers that explain the development, validation and use of these metrics and databases. (https://doi.org/10.1371/journal.pbio.1002501, https://doi.org/10.1371/journal.pbio.3000384 and https://doi.org/10.1371/journal.pbio.3000918).
Finally, we alert users that all citation metrics have limitations and their use should be tempered and judicious. For more reading, we refer to the Leiden manifesto: https://www.nature.com/articles/520429a
One of the lines of research of Sustaining the Knowledge Commons (SKC) is a longitudinal study of the minority (about a third) of the fully open access journals that use the article processing charge (APC) business model. The original idea was to gather data during an annual two-week census period. The volume of data and growth in this area make this an impractical goal. For this reason, we are posting this preliminary dataset in case it might be helpful to others working in this area. Future data gathering and analysis will be conducted on an ongoing basis.
Major sources of data for this dataset include:
• the Directory of Open Access Journals (DOAJ) downloadable metadata; the base set is from May 2014, with some additional data from the 2015 dataset
• data on publisher article processing charges and related information gathered from publisher websites by the SKC team in 2015, 2014 (Morrison, Salhab, Calvé-Genest & Horava, 2015) and a 2013 pilot
• DOAJ article content data screen scraped from DOAJ (caution: this data can be quite misleading due to limitations with article-level metadata)
• Subject analysis based on DOAJ subject metadata in 2014 for selected journals
• Data on APCs gathered in 2010 by Solomon and Björk (supplied by the authors). Note that Solomon and Björk use a different method of calculating APC, so the numbers are not directly comparable.
• Note that this full dataset includes some working columns which are meaningful only as a way of explaining very specific calculations that are not necessarily evident in the dataset itself. Details below.
Significant limitation:
• This dataset does not include new journals added to DOAJ in 2015. A recent publisher size analysis indicates some significant changes. For example, DeGruyter, not listed in the 2014 survey, is now the third largest DOAJ publisher with over 200 titles. Elsevier is now the 7th largest DOAJ publisher. In both cases, gathering data from the publisher websites will be time-consuming as it is necessary to conduct individual title look-up.
• Some OA APC data for newly added journals was gathered in May 2015 but has not yet been added to this dataset.
One of the reasons for gathering this data is a comparison of the DOAJ "one price listed" approach with potentially richer data on the publisher's own website. For full details see the documentation.
Data for the 9 figures contained in the paper, A SOFTWARE FRAMEWORK FOR ASSESSING THE RESILIENCE OF DRINKING WATER SYSTEMS TO DISASTERS WITH AN EXAMPLE EARTHQUAKE CASE STUDY. This dataset is associated with the following publication: Klise, K., M. Bynum, D. Moriarty, and R. Murray. A SOFTWARE FRAMEWORK FOR ASSESSING THE RESILIENCE OF DRINKING WATER SYSTEMS TO DISASTERS WITH AN EXAMPLE EARTHQUAKE CASE STUDY. ENVIRONMENTAL MODELLING & SOFTWARE. Elsevier Science, New York, NY, 95: 420-431, (2017).
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
To generate the bibliographic and survey data to support a data reuse study conducted by several Library faculty and accepted for publication in the Journal of Academic Librarianship, the project team utilized a series of web-based online scripts that employed several different endpoints from the Scopus API. The related dataset: "Data for: An Examination of Data Reuse Practices within Highly Cited Articles of Faculty at a Research University" contains survey design and results.
1) getScopus_API_process_dmp_IDB.asp: uses the Search API to query the Scopus database for papers by UIUC authors published in 2015 -- limited to one of 9 pre-defined Scopus subject areas -- and retrieves metadata results sorted from highest to lowest by the number of times the retrieved articles were cited. The URL for the basic searches took the following form: https://api.elsevier.com/content/search/scopus?query=(AFFIL%28(urbana%20OR%20champaign) AND univ*%29) OR (AF-ID(60000745) OR AF-ID(60005290))&apikey=xxxxxx&start=" & nstart & "&count=25&date=2015&view=COMPLETE&sort=citedby-count&subj=PHYS
Here, the variable nstart was incremented by 25 each iteration and 25 records were retrieved in each pass. The subject area was changed (e.g. from PHYS to COMP for computer science) in each of the 9 runs. This script does not use the Scopus API cursor but downloads 25 records at a time for up to 28 times -- or 675 maximum bibliographic records. The project team felt that looking at the 675 most-cited articles from UIUC faculty in each of the 9 subject areas was sufficient to gather a robust, representative sample of articles from 2015. These downloaded records were stored in a temporary table that was renamed for each of the 9 subject areas.
2) get_citing_from_surveys_IDB.asp: takes a Scopus article ID (eid) from each of the 49 surveys returned by UIUC authors and retrieves short citing article references, 200 at a time, into a temporary composite table. These citing records contain only one author, no author affiliations, and no author email addresses. This script uses the Scopus API cursor=* feature and is able to download all the citing references of an article 200 records at a time.
3) put_in_all_authors_affil_IDB.asp: adds important data to the short citing records. The script adds all co-authors and their affiliations, the corresponding author, and author email addresses.
4) process_for_final_IDB.asp: creates a relational database table with author, title, and source journal information for each of the citing articles that can be copied as an Excel file for processing by the Qualtrics survey software. This was initially 4,626 citing articles over the 49 UIUC authored articles, but was reduced to 2,041 entries after checking for available email addresses and eliminating duplicates.
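The two paging styles used by scripts 1 and 2 above can be sketched in Python as follows. This is an illustrative sketch, not the project's ASP code: the function names are hypothetical, the starting offset of 0 is an assumption, the number of passes is left as a parameter, and `fetch_page` stands in for whatever HTTP call returns one page of records together with the next cursor value.

```python
from urllib.parse import urlencode

SEARCH_URL = "https://api.elsevier.com/content/search/scopus"


def offset_page_urls(query, subject, api_key, passes, page_size=25):
    """Build one search URL per pass, advancing `start` by page_size each time.

    Assumption: the first pass starts at offset 0.
    """
    urls = []
    for i in range(passes):
        params = {
            "query": query,
            "apikey": api_key,
            "start": i * page_size,
            "count": page_size,
            "date": "2015",
            "view": "COMPLETE",
            "sort": "citedby-count",
            "subj": subject,
        }
        urls.append(f"{SEARCH_URL}?{urlencode(params)}")
    return urls


def fetch_all_with_cursor(fetch_page, first_cursor="*"):
    """Collect every record by following cursors until a page comes back empty.

    `fetch_page(cursor)` is a hypothetical callable that returns a tuple
    (records, next_cursor) for the given cursor value.
    """
    records = []
    cursor = first_cursor
    while True:
        page, next_cursor = fetch_page(cursor)
        if not page:
            break
        records.extend(page)
        if next_cursor is None or next_cursor == cursor:
            break
        cursor = next_cursor
    return records


# Offset paging: three passes of 25 records for one subject area.
urls = offset_page_urls("AF-ID(60000745)", "PHYS", "xxxxxx", passes=3)

# Cursor paging against an in-memory stand-in for the API.
fake_pages = {"*": (["r1", "r2"], "c1"), "c1": (["r3"], "c2"), "c2": ([], None)}
citing = fetch_all_with_cursor(lambda cursor: fake_pages[cursor])
```

Offset paging is simple but needs a fixed upper bound on passes, while cursor paging keeps going until the result set is exhausted, which matches the description of script 2 retrieving all citing references.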
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
We are pleased to share the dataset SEED-PQD-v1 (SEED Power Quality Disturbance Dataset v1) used in our study titled "XPQRS: Expert power quality recognition system for sensitive load applications," published in the Elsevier journal Measurement. This dataset is intended for researchers and practitioners in the field of power quality analysis, especially those focusing on sensitive load applications. It can be used in Python as well as in MATLAB.
Access the published paper:
https://www.sciencedirect.com/science/article/abs/pii/S0263224123004530
Dataset Details:
Fundamental Frequency: 50 Hz
Sampling Rate: 5 kHz
Number of Classes: 17
Signals per Class: 1000
Length of Each Signal (samples): 100
Length of Each Signal (time): 20 ms
Amplitude of Each Signal: Scaled between -1 and 1
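The figures above are mutually consistent, which a few lines of Python can confirm. This is only an arithmetic check on the stated values; the variable names are illustrative.

```python
FUNDAMENTAL_HZ = 50        # fundamental frequency
SAMPLING_HZ = 5_000        # sampling rate (5 kHz)
SAMPLES_PER_SIGNAL = 100   # length of each signal in samples

# Samples needed to cover one cycle of the fundamental.
samples_per_cycle = SAMPLING_HZ / FUNDAMENTAL_HZ

# Signal duration in milliseconds.
duration_ms = SAMPLES_PER_SIGNAL * 1_000 / SAMPLING_HZ

# Number of fundamental cycles captured in each signal.
cycles_per_signal = SAMPLES_PER_SIGNAL / samples_per_cycle

# Time stamp (ms) of every sample, starting at 0.
time_axis_ms = [n * 1_000 / SAMPLING_HZ for n in range(SAMPLES_PER_SIGNAL)]
```

Each signal therefore spans exactly one 50 Hz cycle, sampled every 0.2 ms.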
Data Format:
The dataset is available in two formats: MATLAB and CSV.
MATLAB File:
Filename: 5Kfs_1Cycle_50f_1000Sam_1A.mat
Structure: A matrix of dimensions (1000 x 100 x 17), where:
1000 = Signals per class
100 = Samples per signal
17 = Number of classes
Class Order:
Pure_Sinusoidal
Sag
Swell
Interruption
Transient
Oscillatory_Transient
Harmonics
Harmonics_with_Sag
Harmonics_with_Swell
Flicker
Flicker_with_Sag
Flicker_with_Swell
Sag_with_Oscillatory_Transient
Swell_with_Oscillatory_Transient
Sag_with_Harmonics
Swell_with_Harmonics
Notch
CSV Files:
Files: 17 CSV files, one for each class.
Structure: Each CSV file has dimensions (1000 x 100), where:
1000 = Signals per class
100 = Samples per signal
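A minimal Python sketch for reading one class file and mapping class indices to names is given below. It assumes the CSV files are comma-separated with no header row, which the description does not state; `read_class_csv` is a hypothetical helper, and the in-memory file is a synthetic stand-in for a real class file.

```python
import csv
import io

# Class order as listed in the dataset description (index 0 = first class).
# In MATLAB the same classes sit at indices 1..17 of the third dimension.
CLASS_NAMES = [
    "Pure_Sinusoidal", "Sag", "Swell", "Interruption", "Transient",
    "Oscillatory_Transient", "Harmonics", "Harmonics_with_Sag",
    "Harmonics_with_Swell", "Flicker", "Flicker_with_Sag",
    "Flicker_with_Swell", "Sag_with_Oscillatory_Transient",
    "Swell_with_Oscillatory_Transient", "Sag_with_Harmonics",
    "Swell_with_Harmonics", "Notch",
]


def read_class_csv(handle, n_signals=1000, n_samples=100):
    """Read one class file into a list of signals and check its shape.

    Assumption: one signal per row, comma-separated, no header row.
    """
    rows = [[float(value) for value in row] for row in csv.reader(handle) if row]
    if len(rows) != n_signals or any(len(row) != n_samples for row in rows):
        raise ValueError("expected a %d x %d table" % (n_signals, n_samples))
    return rows


# Synthetic stand-in for one class file: 1000 rows of 100 samples each.
fake_file = io.StringIO("\n".join(",".join(["0.5"] * 100) for _ in range(1000)))
signals = read_class_csv(fake_file)
```

With a real file, the handle would come from `open(path, newline="")`; the shape check catches a header row or a truncated download early.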
Usage:
This dataset is designed to support the development and testing of power quality recognition systems. The 17 classes cover a broad range of power quality disturbances, providing a comprehensive resource for training machine learning models and validating their performance in recognizing various types of power quality issues.
Acknowledgements:
All users of the dataset are advised to cite the following article:
Citation: Muhammad Umar Khan, Sumair Aziz, Adil Usman, XPQRS: Expert power quality recognition system for sensitive load applications, Measurement, Volume 216, 2023, 112889, ISSN 0263-2241, https://doi.org/10.1016/j.measurement.2023.112889.
Thank you for your interest in our work. We hope this dataset facilitates further advancements in power quality analysis and related fields.
Keywords: Power Quality Recognition, Power Quality Classification, Electrical Signal Analysis, Power System Disturbances, Signal Processing, Power Quality Monitoring
Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Banana cultivation is frequently challenged by various diseases that severely impact yield. These diseases detrimentally affect banana plants, causing growth inhibition, diminished fruit production, and even plant fatality. The consequences are far-reaching, as afflicted plants struggle to yield marketable fruit, leading to financial setbacks for banana growers and the potential to disrupt the global banana supply.
The dataset comprises a diverse collection of images showcasing three prominent banana leaf spot diseases, namely: 1. Sigatoka 2. Cordana 3. Pestalotiopsis Additionally, images depicting healthy banana leaves are incorporated for comprehensive analysis.
The images were captured using smartphone cameras in the banana fields of Bangabandhu Sheikh Mujibur Rahman Agricultural University, Bangladesh, and nearby banana fields in June 2021. All images were labelled by an expert plant pathologist.
The dataset is constituted of two subsets. a) Original Set: This comprises 937 RGB images, divided into 4 classes and provided in JPG format. b) Augmented Set: This set supplements the original collection with 400 images per class, culminating in a total of 1600 images. Employing augmentation techniques, such as Gaussian blur, horizontal flip, cropping, linear contrast adjustment, shear, translation, and rotational shear, we enhanced the dataset's diversity. All images have a standard resolution of 224 x 224 pixels.
Please consider reading the following research articles based on this dataset:
1. BananaSqueezeNet: A very fast, lightweight convolutional neural network for the diagnosis of three prominent banana leaf diseases
2. BananaLSD: A Banana Leaf Images Dataset for Classification of Banana Leaf Diseases Using Machine Learning
If you're using this dataset for your work, please cite the following articles:
@article{bhuiyan2023bananasqueezenet,
title={BananaSqueezeNet: A very fast, lightweight convolutional neural network for the diagnosis of three prominent banana leaf diseases},
author={Bhuiyan, Md Abdullahil Baki and Abdullah, Hasan Muhammad and Arman, Shifat E and Rahman, Sayed Saminur and Al Mahmud, Kaies},
journal={Smart Agricultural Technology},
volume={4},
pages={100214},
year={2023},
publisher={Elsevier}
}
@article{arman2023bananalsd,
title={BananaLSD: A banana leaf images dataset for classification of banana leaf diseases using machine learning},
author={Arman, Shifat E and Bhuiyan, Md Abdullahil Baki and Abdullah, Hasan Muhammad and Islam, Shariful and Chowdhury, Tahsin Tanha and Hossain, Md Arban},
journal={Data in Brief},
pages={109608},
year={2023},
publisher={Elsevier}
}
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
The RibFrac dataset is a benchmark for developing algorithms for rib fracture detection, segmentation and classification. We hope this large-scale dataset can facilitate both clinical research on automatic rib fracture detection and diagnosis, and engineering research on 3D detection, segmentation and classification.
Due to the size limit of zenodo.org, we split the whole RibFrac Training Set into 2 parts. This is Training Set Part 2 of the RibFrac dataset, including 120 CTs and the corresponding annotations. Files include:
ribfrac-train-images-2.zip: 120 chest-abdomen CTs in NII format (nii.gz).
ribfrac-train-labels-2.zip: 120 annotations in NII format (nii.gz).
ribfrac-train-info-2.csv: labels in the annotation NIIs.
public_id: anonymous patient ID to match images and annotations.
label_id: discrete label value in the NII annotations.
label_code: 0, 1, 2, 3, 4, -1
0: it is background
1: it is a displaced rib fracture
2: it is a non-displaced rib fracture
3: it is a buckle rib fracture
4: it is a segmental rib fracture
-1: it is a rib fracture, but we could not define its type due to ambiguity, diagnosis difficulty, etc. Ignore it in the classification task.
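The label table above can be applied in Python roughly as follows. This is a sketch under assumptions: it takes the info CSV to have a header row with the column names `public_id`, `label_id` and `label_code` as listed, the short class names are illustrative, and the sample rows (including the patient IDs) are made up.

```python
import csv
import io

# Meaning of each label_code, following the list above.
LABEL_CODE_NAMES = {
    0: "background",
    1: "displaced",
    2: "non-displaced",
    3: "buckle",
    4: "segmental",
    -1: "undefined",  # a fracture whose type could not be determined
}


def classification_rows(handle):
    """Return (public_id, label_id, class name) for typed fractures only.

    Background (0) and undefined (-1) entries are skipped, since the
    description says to ignore -1 in the classification task.
    """
    kept = []
    for row in csv.DictReader(handle):
        code = int(row["label_code"])
        if code in (1, 2, 3, 4):
            kept.append((row["public_id"], int(row["label_id"]), LABEL_CODE_NAMES[code]))
    return kept


# Made-up rows standing in for ribfrac-train-info-2.csv.
sample = io.StringIO(
    "public_id,label_id,label_code\n"
    "patientA,0,0\n"
    "patientA,1,2\n"
    "patientA,2,-1\n"
    "patientB,1,4\n"
)
typed_fractures = classification_rows(sample)
```

For detection or segmentation, entries with code -1 are still fractures and would normally be kept; only the classification task drops them.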
If you find this work useful in your research, please acknowledge the RibFrac project teams in the paper and cite this project as:
Liang Jin, Jiancheng Yang, Kaiming Kuang, Bingbing Ni, Yiyi Gao, Yingli Sun, Pan Gao, Weiling Ma, Mingyu Tan, Hui Kang, Jiajun Chen, Ming Li. Deep-Learning-Assisted Detection and Segmentation of Rib Fractures from CT Scans: Development and Validation of FracNet. EBioMedicine (2020). (DOI)
or using bibtex
@article{ribfrac2020, title={Deep-Learning-Assisted Detection and Segmentation of Rib Fractures from CT Scans: Development and Validation of FracNet}, author={Jin, Liang and Yang, Jiancheng and Kuang, Kaiming and Ni, Bingbing and Gao, Yiyi and Sun, Yingli and Gao, Pan and Ma, Weiling and Tan, Mingyu and Kang, Hui and Chen, Jiajun and Li, Ming}, journal={EBioMedicine}, year={2020}, publisher={Elsevier} }
The RibFrac dataset is the result of thousands of hours of work by experienced radiologists, computer scientists and engineers. We kindly ask you to respect our effort by citing it appropriately and complying with the data license.
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
Attribution-ShareAlike 3.0 (CC BY-SA 3.0) https://creativecommons.org/licenses/by-sa/3.0/
License information was derived automatically
This dataset was used in previous research projects, for example in IMPROVE.
The use case focuses on predicting the condition of an important component within production lines. The condition of this component is important for the function of the plant and the resulting product quality. Data for 8 run-to-failure experiments were provided and 8 features related to the component were selected. Training and prediction data were selected using the leave-one-out method: data from the component under test were selected as the target for the prediction. A set amount of data from all other components was selected and combined to serve as training data for the 'new' condition. A self-organizing map (SOM) was trained on the training data to represent the 'new' condition. The degradation of the component under test was then calculated and visualized. This procedure was repeated for all 8 data sets to obtain a prediction of the degradation for all components. The prediction worked for all cases that experts had labeled with a certain type of wear. Furthermore, one of the components did not show signs of wear according to experts, which was also confirmed by the model.
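The leave-one-out selection described above can be sketched in Python as follows. Only the train/target split is shown; training the SOM and computing the degradation are not, because the description does not specify them. The experiment identifiers are hypothetical.

```python
def leave_one_out(experiment_ids):
    """Yield (target, training) pairs: each experiment is held out once."""
    for target in experiment_ids:
        training = [e for e in experiment_ids if e != target]
        yield target, training


# Hypothetical identifiers for the 8 run-to-failure experiments.
experiments = ["exp%d" % i for i in range(1, 9)]
splits = list(leave_one_out(experiments))
```

Each of the 8 splits holds out one component as the prediction target and pools the other 7 as the source of training data for the 'new' condition.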
This dataset is publicly available for anyone to use under the following terms.
von Birgelen, Alexander; Buratti, Davide; Mager, Jens; Niggemann, Oliver: Self-Organizing Maps for Anomaly Localization and Predictive Maintenance in Cyber-Physical Production Systems. In: 51st CIRP Conference on Manufacturing Systems (CIRP CMS 2018) CIRP-CMS, May 2018.
Paper available open access: https://authors.elsevier.com/sd/article/S221282711830307X
IMPROVE has received funding from the European Union's Horizon 2020 research and innovation programme under Grant Agreement No. 678867
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The advantages of choosing The Lens as an abstract database for identifying relevant publication topics in a given subject area: open access to data, the possibility of simultaneously exporting up to 50 thousand bibliometric records, and advanced analytics.
The data exported from The Lens database can be used in the most commonly used programs for bibliometric analysis, VOSviewer and Bibliometrix.
A known issue with The Lens: the Keywords and Abstract fields are poorly populated, which hampers the analysis of publication topics (authors' interests) in this system.
Characteristics of The Lens database as of September 29, 2024: Scholarly Works (280,455,781) = All Docs; 134M Analytics Set; 41.5M Authors; 1.3M Source Titles; 698K Fields of Study; 353 Journal Subjects; 5.9M Keywords; 254.2K Chemicals; 30.6K MeSH Headings; 41.4M Institutions; 2.6M Funding Organizations; 4.6K Conferences.
The 698K Fields of Study form a very detailed classification of bibliometric data that can be used similarly to Index Keywords in Scopus and Keywords Plus in WoS.
The approach proposed in this study is to populate the Keywords and Abstract fields in The Lens with data taken from the bibliometric data of the respective publishers, which are usually very well populated.
Taking into account the interests of the author of this paper, the data of the publisher Elsevier available in the open ScienceDirect database and corresponding to the following query are chosen as an example: 'Title, abstract, keywords: digital energy; Article type: Review articles and Research articles; Years: 2020–2024; Subject areas: Engineering, Materials Science and Energy; Languages: English'. Data are current as of September 16, 2024.
Omitting the details of data export and preprocessing, the final file used in the VOSviewer and Bibliometrix programs contained 3373 bibliometric records.
Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Abstract
We developed a methodology to detect the oldest scholarly reference added to Wikipedia articles by which a certain paper is uniquely identifiable as the "first appearance of the scholarly reference." We identified the first appearances of 923,894 scholarly references (611,119 unique DOIs) in 180,795 unique pages on English Wikipedia as of March 1, 2017, and stored them in the dataset. Moreover, we assessed the precision of the dataset, which was high regardless of the research field.
Data Records
The data format of the dataset is JSON Lines, where each line is a single record. In this dataset, we detected the first appearance of each scholarly reference added to Wikipedia articles. If there are multiple references corresponding to the same paper on the same page, only the oldest one is collected. The fields of a record, with sample values, are as follows.
doi -- DOI corresponding to the paper (String), e.g., "10.1006/anbe.1996.0497"
paper_type -- Document type of the paper (String), e.g., "journal-article"
paper_container_title -- Journal title, book title, or proceedings title (Array of String), e.g., ["Animal Behaviour"]
paper_publisher -- Publisher name (String), e.g., "Elsevier BV"
paper_title -- Paper title (Array of String), e.g., ["Push or pull: an experimental study on imitation in marmosets"]
paper_published_year -- Published year (String), e.g., "1997"
paper_issue -- Issue number (String), e.g., "4"
paper_volume -- Volume number (String), e.g., "54"
paper_page -- Page numbers (String), e.g., "817-831"
paper_author -- Author information consisting of given and family names, sequence (order among the authors), and affiliations (Array of JSON), e.g., [{"given":"THOMAS", "family":"BUGNYAR", "sequence":"first", "affiliation":[]}, {"given":"LUDWIG", "family":"HUBER", "sequence":"additional", "affiliation":[]}]
issn -- ISSN related to the paper (Array of String), e.g., ["0003-3472"]
research_field -- Research fields from ESI categories (Array of String), e.g., ["PLANT & ANIMAL SCIENCE"]
page_id -- Page id (String), e.g., "577858"
page_title -- Page title (String), e.g., "Imitation"
revision_id -- Revision id (String), e.g., "203309031"
revision_timestamp -- Revision timestamp (String), e.g., "2008-04-04 15:54:09 UTC"
revision_comment -- Revision comment (edit summary) (String), e.g., "/* Animal Behaviour */"
editor_name -- Wikipedia editor's name (String), e.g., "Nicemr"
editor_type -- Type of the editor (String), e.g., "User"
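Records in this layout can be read in Python with the standard library alone. The sketch below is illustrative: the function names are hypothetical, the sample line reuses only the example values listed above (with most fields omitted for brevity), and the timestamp format is inferred from the single example shown.

```python
import json
from datetime import datetime, timezone


def parse_record(line):
    """Parse one JSON line and convert revision_timestamp to an aware datetime.

    Assumption: timestamps always look like "2008-04-04 15:54:09 UTC".
    """
    record = json.loads(line)
    stamp = datetime.strptime(record["revision_timestamp"], "%Y-%m-%d %H:%M:%S UTC")
    record["revision_timestamp"] = stamp.replace(tzinfo=timezone.utc)
    return record


def iter_records(lines):
    """Yield parsed records, skipping blank lines."""
    for line in lines:
        line = line.strip()
        if line:
            yield parse_record(line)


# One abbreviated sample line built from the example values above.
sample_line = json.dumps({
    "doi": "10.1006/anbe.1996.0497",
    "paper_published_year": "1997",
    "page_title": "Imitation",
    "revision_timestamp": "2008-04-04 15:54:09 UTC",
})
records = list(iter_records([sample_line, ""]))

# Years between publication and first appearance on the page.
lag_years = records[0]["revision_timestamp"].year - int(records[0]["paper_published_year"])
```

Because numeric-looking fields such as `paper_published_year` are stored as strings, they need an explicit conversion before any arithmetic.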
References
Kikkawa, J., Takaku, M. & Yoshikane, F. Dataset of first appearances of the scholarly bibliographic references on Wikipedia articles (submitted to Scientific Data).
FUNDING
JSPS KAKENHI Grant Number JP20K12543
JSPS KAKENHI Grant Number JP21K21303
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Comparison of COVID-19 publications by highlighting of sex- and gender-specific health (SGSH) content and CiteScore.
Attribution-ShareAlike 3.0 (CC BY-SA 3.0) https://creativecommons.org/licenses/by-sa/3.0/
License information was derived automatically
This dataset contains the machine data of a degrading component recorded over a total of 12 months. It was initiated in the European research and innovation project IMPROVE.
The Vega shrink-wrapper from OCME is deployed in large production lines in the food and beverage industry. The machine groups loose bottles or cans into set package sizes, wraps them in plastic film and then heat-shrinks the plastic film to combine them into a package. The plastic film is fed into the machine from large spools and is then cut to the length needed to wrap the film around a pack of goods. The cutting assembly is an important component of the machine for meeting the high availability target. Therefore, the blade needs to be set up and maintained properly. Furthermore, the blade cannot be inspected visually during operation because it is enclosed in a metal housing and rotates at high speed. Monitoring the cutting blade's degradation will increase the machine's reliability and reduce unexpected downtime caused by failed cuts.
For more information see also this new vs worn blade data.
The 519 files in the dataset are of the format MM-DDTHHMMSS_NUM_modeX.csv, where MM is the month ranging from 1-12 (not calendar month), DD is the day of the month, HHMMSS is the start time of day of recording, NUM is the sample number and X is a mode ranging from 1-8. Each file is a ~8 second sample with a time resolution of 4ms that totals 2048 time-samples for every file.
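The naming scheme above can be parsed with a regular expression. The Python sketch below is an illustration under assumptions: the pattern accepts one- or two-digit month and day fields because the description does not say whether they are zero-padded, and the example file name is made up to fit the pattern.

```python
import re

# MM-DDTHHMMSS_NUM_modeX.csv, with X in 1..8.
FILENAME_RE = re.compile(
    r"^(?P<month>\d{1,2})-(?P<day>\d{1,2})T(?P<time>\d{6})"
    r"_(?P<num>\d+)_mode(?P<mode>[1-8])\.csv$"
)

SAMPLES_PER_FILE = 2048
TIME_STEP_MS = 4


def parse_filename(name):
    """Split a file name into its parts, or return None if it does not match."""
    match = FILENAME_RE.match(name)
    if match is None:
        return None
    return {
        "month": int(match["month"]),  # month of the experiment, not calendar month
        "day": int(match["day"]),
        "start_time": match["time"],   # HHMMSS kept as a string
        "sample_number": int(match["num"]),
        "mode": int(match["mode"]),
    }


# Hypothetical file name following the documented pattern.
parts = parse_filename("01-04T184148_000_mode1.csv")

# Duration of one file: 2048 samples at 4 ms each.
duration_s = SAMPLES_PER_FILE * TIME_STEP_MS / 1000
```

The duration works out to 8.192 s, which agrees with the "~8 second" figure given above.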
This dataset is publicly available for anyone to use under the following terms.
von Birgelen, Alexander; Buratti, Davide; Mager, Jens; Niggemann, Oliver: Self-Organizing Maps for Anomaly Localization and Predictive Maintenance in Cyber-Physical Production Systems. In: 51st CIRP Conference on Manufacturing Systems (CIRP CMS 2018) CIRP-CMS, May 2018.
Paper available open access: https://authors.elsevier.com/sd/article/S221282711830307X
IMPROVE has received funding from the European Union's Horizon 2020 research and innovation programme under Grant Agreement No. 678867
Show the degradation of the component over the course of the year. Has the component been replaced at some point? If the wear can be predicted accurately, a remaining useful life prediction can be made in order to determine maintenance windows (predictive maintenance).
There are 8 different modes and several different speeds that the machine can be operated in. Is it possible to infer such modes by time series analysis?
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
A dataset for a paper in LIBER Quarterly describing the impact of funder open access policies that mandate self-archiving in Australia and Canada.
Includes the underlying journal-level data in .csv and .xlsx format, the statistical output for stepwise linear regression from JASP, and the Prism XML file used to generate the figures.
Khoo, S. Y.-S., & Lay, B. P. P. (2018). A very long embargo: Journal choice reveals active non-compliance with funder open access policies by Australian and Canadian neuroscientists. LIBER Quarterly, 28. doi:10.18352/lq.10252
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Flenley et al. stated the problem of automatic pollen image recognition more than 50 years ago. It requires solving two tasks: detection and classification. Nowadays, both tasks can be successfully handled using computer vision (CV) methods. However, the main obstacle to solving these tasks is the absence of large open datasets for benchmarking. Existing open datasets for the pollen classification task (POLEN23E, POLLEN73S) are quite small and represent different domains, so merging them is not straightforward. Moreover, there is no open pollen dataset annotated for the detection task.
Thus, here we present the largest open pollen dataset obtained from the bright-field microscope, including 20 plant species, 2413 images containing 7745 single pollen grains, annotated for the classification and detection tasks.
The images in the dataset were obtained using an Olympus BX51 light microscope. Pollen was stained with Fuchsin. The dataset was made using the Olympus DP71 image viewing system. All the pollen species were collected in Perm Krai, Russia, but are typical of Europe in general.
The dataset is related to two domains: allergenic-specific palynology and mellisopalynology. Hence, it contains 13 taxa of allergenic plants (willow, linden, alder, birch, nettle, pigweed, plantain, sorrel, grass, pine, maple, hazel, mugwort) and 8 taxa of honey plants (linden, buckwheat, clover, angelica wild, angelica garden, hill mustard, meadow pink, fireweed).
We call the allergenic dataset POLLEN13L-det. We set the baseline for detection and recognition on this dataset; see the paper [link to be provided]. We achieved an average precision of 96.3% for the detection task and an F1 measure of 98.34% for the classification task.
To cite our dataset, please use Khanzhina, Natalia, et al. "Combating data incompetence in pollen images detection and classification for pollinosis prevention." Computers in biology and medicine 140 (2022): 105064.
or
@article{khanzhina2022combating, title={Combating data incompetence in pollen images detection and classification for pollinosis prevention}, author={Khanzhina, Natalia and Filchenkov, Andrey and Minaeva, Natalia and Novoselova, Larisa and Petukhov, Maxim and Kharisova, Irina and Pinaeva, Julia and Zamorin, Georgiy and Putin, Evgeny and Zamyatina, Elena and others}, journal={Computers in biology and medicine}, volume={140}, pages={105064}, year={2022}, publisher={Elsevier} }
This dataset is collected with the great help of Larisa Novoselova, Irina Kharisova, Julia Pinaeva and Georgiy Zamorin.
Researchers are welcome to solve the detection and classification tasks :) Although we set relatively high baseline scores on these tasks, there is still room for improvement. For example, two species belong to one genus, angelica wild and angelica garden, and their shapes are almost indistinguishable even to most palynologists, which significantly complicates recognition.
Make pollen recognition great again!
Open Database License (ODbL) v1.0 https://www.opendatacommons.org/licenses/odbl/1.0/
License information was derived automatically
Data from: An implicit, conservative electrostatic particle-in-cell algorithm for paraxial magnetic nozzles
Authors: Pedro Jimenez, Luis Chacon, Mario Merino
Contact email: pejimene@ing.uc3m.es
Date: 2024-02-09
Keywords: electric propulsion, plasma simulation, magnetic nozzles, implicit particle-in-cell (PIC)
Version: 1.2
Digital Object Identifier (DOI): 10.5281/zenodo.8081962
License: This dataset is made available under the Open Data Commons Attribution License
Abstract
This dataset contains the data found in the plots of the journal article:
Pedro Jimenez, Luis Chacon, Mario Merino, "An implicit, conservative electrostatic particle-in-cell algorithm for paraxial magnetic nozzles"
The data in this repository are the results of kinetic plasma simulations as described in the reference. For further information on the setup for the simulation please refer to the article.
Data Files
The data files are in .csv format. They were produced in Julia using CSV.jl and DataFrames.jl libraries.
The files are organised following the order of the figures in the article. All the plots are 1D series, with the first column corresponding to the x-axis data. Y-axis data are given in the following columns; the number of additional columns equals the number of line series. The title of each series is found in the first row of the .csv files. Specifics for certain figures are given below:
The columns for the time evolution in fig6_left.csv and fig6_right.csv (corresponding to the actual left and right columns in the figure i.e. cases A and B) contain a field tag followed by the corresponding time step (e.g. phi_500).
Due to the different number of nodes, steady state fields for cases A and B are saved in fig8_a-f.csv while cases AF and BF are saved in fig8_a-f_fine.csv.
The rest of the data files should be self-descriptive.
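Although the files were produced in Julia, the layout described above is easy to read from other languages. The Python sketch below is illustrative only: the function names are hypothetical, it assumes every cell below the header row is numeric, and the sample content (including the x-axis column name) is made up, with only the `phi_500` style of tag taken from the description.

```python
import csv
import io


def read_series(handle):
    """Return (x_values, {series_title: y_values}) from one figure file.

    The first row holds the series titles and the first column holds x.
    """
    reader = csv.reader(handle)
    header = next(reader)
    columns = [[] for _ in header]
    for row in reader:
        for column, value in zip(columns, row):
            column.append(float(value))
    series = dict(zip(header[1:], columns[1:]))
    return columns[0], series


def split_tag(column_name):
    """Split a time-evolution column such as 'phi_500' into ('phi', 500)."""
    field, _, step = column_name.rpartition("_")
    return field, int(step)


# Made-up content in the same layout as the fig6 files.
sample = io.StringIO("z,phi_0,phi_500\n0.0,1.0,0.5\n1.0,0.9,0.4\n")
x, series = read_series(sample)
tags = [split_tag(name) for name in series]
```

Splitting on the last underscore keeps field tags that themselves contain underscores intact.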
Citation
Any works using this dataset or any part of it in any form shall cite it as follows:
The preferred means of citation is to reference the associated journal article, with DOI: 10.1016/j.jcp.2024.112826
The BibTeX entry is also provided for convenience:
@article{jimenez2024implicit, title={An implicit, conservative electrostatic particle-in-cell algorithm for paraxial magnetic nozzles}, author={Jim{\'e}nez, Pedro and Chac{\'o}n, Luis and Merino, Mario}, journal={Journal of Computational Physics}, pages={112826}, year={2024}, publisher={Elsevier} }
Optionally the dataset can be cited by referencing the corresponding DOI:
https://doi.org/10.5281/zenodo.8081962
Acknowledgments
This dataset was created by the ERC-ZARATHUSTRA project.
The ERC-ZARATHUSTRA project has received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreement No 950466).
Supplementary material to the preprint "Why Electronic Open Access Matters. Analysis of Publications on Energy Topics"
DOI: 10.5281/zenodo.4661185
Annotation
The paper analyzes the possible reasons why the growth of the number of publications in the journal Energies for 2011-2020 is more than 10 times higher than the number of publications in Applied Energy or Energy with the best citation rates and published by one of the oldest publishers - Elsevier. It is shown that the significant growth in the number of publications of Energies can be explained by the following features of the journal. The journal with open access to all full-text articles is published only in electronic format, while only 4.6% of publications of the journal Energy and 12.1% of publications of the journal Applied Energy are open access, these journals have printed versions. The journal Energies admits articles of large length, such as 66 pages per publication. Its publication fees are substantially less than those of journals of a similar subject but with print versions, ~$2224 for Energies, $3400 for Energy, and $3400 for Applied Energy, respectively. The rate of publication of articles is several times faster than that of Elsevier journals. The first decision is made in about 16.5 days from the time the article is submitted; acceptance for publication is in 3.5 days. The journal has developed a large community of reviewers and affiliated institutions who are rewarded for their work with reduced fees for journal publications. The journal is less dominated by authors from China and the United States, making it more attractive to authors from other countries.
The Lens platform classifies all articles in Energies as belonging to the General Computer Science Subject, which is especially important for finding optimal solutions to various energy problems, while the Energy and Applied Energy journals categorize their publications into more traditional Subjects: General Energy; Pollution; Building and Construction; Civil and Structural Engineering.
Original source here.
This dataset was collected in the Cranfield Multiphase Flow Facility to serve as a benchmark case for statistical process monitoring.
Read this paper for more information.
Cite as:
Yi Cao (2020). A Benchmark Case for Statistical Process Monitoring - Cranfield Multiphase Flow Facility (https://www.mathworks.com/matlabcentral/fileexchange/50938-a-benchmark-case-for-statistical-process-monitoring-cranfield-multiphase-flow-facility), MATLAB Central File Exchange. Retrieved November 23, 2020.
License:
Copyright (c) 2015, Yi Cao All rights reserved.
Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
ABSTRACT
Background: Dystonia is a heterogeneous disorder that, when refractory to medical treatment, may respond favorably to deep brain stimulation (DBS). A bibliometric analysis is a practical way to get an overview of a research domain, as it makes the field's directions and needs more accessible to researchers and others outside it.
Objective: To analyze the 100 most cited articles on the use of DBS for dystonia treatment in the last 30 years.
Methods: The research protocol was performed in June 2019 in Elsevier's Scopus database, by retrieving the most cited articles regarding DBS in dystonia. We analyzed authors, year of publication, country, affiliation, and targets of DBS.
Results: Articles are mainly published in Movement Disorders (19%), Journal of Neurosurgery (9%), and Neurology (9%). European countries offer significant contributions (57% of our sample). France (192.5 citations/paper) and Germany (144.1 citations/paper) have the highest citation rates of all countries. The United States contributes 31% of the articles, with 129.8 citations/paper. The publications are focused on General outcomes (46%), followed by Long-term outcomes (12.5%), and Complications (11%), and the leading type of dystonia researched is idiopathic or inherited, isolated, segmental or generalized dystonia, with 27% of articles and 204.3 citations/paper.
Conclusions: DBS in dystonia research is mainly published in a handful of scientific journals and focused on the outcomes of the surgery in idiopathic or inherited, isolated, segmental or generalized dystonia, with globus pallidus internus as the main DBS target.
Attribution-NonCommercial 3.0 (CC BY-NC 3.0) https://creativecommons.org/licenses/by-nc/3.0/
License information was derived automatically
Citation metrics are widely used and misused. We have created a publicly available database of top-cited scientists that provides standardized information on citations, h-index, co-authorship adjusted hm-index, citations to papers in different authorship positions and a composite indicator (c-score). Separate data are shown for career-long and, separately, for single recent year impact. Metrics with and without self-citations and ratio of citations to citing papers are given. Scientists are classified into 22 scientific fields and 174 sub-fields according to the standard Science-Metrix classification. Field- and subfield-specific percentiles are also provided for all scientists with at least 5 papers. Career-long data are updated to end-of-2022 and single recent year data pertain to citations received during calendar year 2022. The selection is based on the top 100,000 scientists by c-score (with and without self-citations) or a percentile rank of 2% or above in the sub-field. This version (6) is based on the October 1, 2023 snapshot from Scopus, updated to end of citation year 2022. This work uses Scopus data provided by Elsevier through ICSR Lab (https://www.elsevier.com/icsr/icsrlab). Calculations were performed using all Scopus author profiles as of October 1, 2023. If an author is not on the list it is simply because the composite indicator value was not high enough to appear on the list. It does not mean that the author does not do good work.
PLEASE ALSO NOTE THAT THE DATABASE HAS BEEN PUBLISHED IN AN ARCHIVAL FORM AND WILL NOT BE CHANGED. The published version reflects Scopus author profiles at the time of calculation. We thus advise authors to ensure that their Scopus profiles are accurate. REQUESTS FOR CORRECTIONS OF THE SCOPUS DATA (INCLUDING CORRECTIONS IN AFFILIATIONS) SHOULD NOT BE SENT TO US. They should be sent directly to Scopus, preferably by use of the Scopus to ORCID feedback wizard (https://orcid.scopusfeedback.com/) so that the correct data can be used in any future annual updates of the citation indicator databases.
The c-score focuses on impact (citations) rather than productivity (number of publications) and it also incorporates information on co-authorship and author positions (single, first, last author). If you have additional questions, please read the 3 associated PLoS Biology papers that explain the development, validation and use of these metrics and databases. (https://doi.org/10.1371/journal.pbio.1002501, https://doi.org/10.1371/journal.pbio.3000384 and https://doi.org/10.1371/journal.pbio.3000918).
Finally, we alert users that all citation metrics have limitations and their use should be tempered and judicious. For more reading, we refer to the Leiden manifesto: https://www.nature.com/articles/520429a