Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
These JSON documents contain mappings for materials science entity normalization. Each entity is mapped onto its most frequently occurring synonym that is not an acronym. We provide entity normalization for materials science properties (pro), applications (apl), sample descriptors (dsc), symmetry/phase labels (spl), synthesis methods (smt), and characterization methods (cmt). Each term has a "most common" entity to which it can be mapped; sub-entities are also included and have likewise been normalized.
*Please note: entities that occur infrequently in our corpus are unlikely to be normalized (and are less likely to be normalized correctly). In line with Zipf's law for NLP, infrequently occurring entities make up the largest portion of unique entities in the corpus, so a large fraction of entities in these JSON files are not normalized. However, frequently occurring terms like "XRD" are very likely to be normalized, and should be normalized correctly.
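As a rough illustration, a lookup against one of these mappings might look like the sketch below. The file name and key structure here are assumptions for illustration only, not the published schema, and the `normalize` helper is hypothetical.

```python
import json

# Hypothetical layout: one JSON file per entity type (pro, apl, dsc, spl, smt, cmt),
# each mapping a raw term to its most common non-acronym synonym.
with open("cmt_normalization.json") as f:
    cmt_map = json.load(f)

def normalize(term: str, mapping: dict) -> str:
    # Fall back to the raw term when it is absent (the long Zipf tail of rare entities).
    return mapping.get(term, term)

print(normalize("XRD", cmt_map))  # e.g. "X-ray diffraction", if present in the mapping
```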
CC0 1.0 (Public Domain Dedication): https://creativecommons.org/publicdomain/zero/1.0/
Introduction: This enhanced dataset builds upon "Uncovering Knowledge: A Clean Brain Tumor Dataset for Advanced Medical Research." It has undergone significant improvements; the original authors' contributions are acknowledged below.
Data Source: The initial data was sourced from the brain tumor classification MRI dataset, which is accessible at this link, and was generously shared by Sartaj.
Enhancements Made:
Removal of Redundant Data: Redundant data, including augmentations such as salt-and-pepper noise and geometric transformations, has been removed to ensure sample consistency.
Image Normalization: Images have been normalized using their grayscale histograms, enhancing image quality and comparability.
Resizing with Aspect Ratio Preservation: All images have been resized to a uniform 256 x 256 pixels while preserving the original aspect ratio, yielding consistent images without distortion (a preprocessing sketch follows this list).
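The exact normalization and padding scheme is not specified, so the following is only a plausible reading of these two steps: histogram equalization for the grayscale normalization, and zero-padding (letterboxing) to reconcile a fixed 256 x 256 output with a preserved aspect ratio.

```python
import cv2
import numpy as np

def preprocess(path: str, size: int = 256) -> np.ndarray:
    """Histogram-equalize a grayscale MRI slice and letterbox it to size x size."""
    img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    img = cv2.equalizeHist(img)  # one common form of grayscale-histogram normalization
    h, w = img.shape
    scale = size / max(h, w)  # keep the aspect ratio
    img = cv2.resize(img, (int(round(w * scale)), int(round(h * scale))))
    canvas = np.zeros((size, size), dtype=img.dtype)  # assumed zero-padding
    top = (size - img.shape[0]) // 2
    left = (size - img.shape[1]) // 2
    canvas[top:top + img.shape[0], left:left + img.shape[1]] = img
    return canvas
```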
Acknowledgment to Original Authors: We extend our sincere gratitude to the original authors of the "Uncovering Knowledge" dataset for their invaluable work in data collection and initial cleaning, which laid the foundation for this enhanced version.
License: This dataset is released under the CC0 license, making it accessible to the medical research community and promoting collaboration and innovation.
CC0 1.0 (Public Domain Dedication): https://creativecommons.org/publicdomain/zero/1.0/
In this work we investigate the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting. Our main contribution is a thorough evaluation of networks of increasing depth using an architecture with very small (3x3) convolution filters, which shows that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 weight layers. These findings were the basis of our ImageNet Challenge 2014 submission, where our team secured the first and the second places in the localisation and classification tracks respectively. We also show that our representations generalise well to other datasets, where they achieve state-of-the-art results. We have made our two best-performing ConvNet models publicly available to facilitate further research on the use of deep visual representations in computer vision.
Authors: Karen Simonyan, Andrew Zisserman
https://arxiv.org/abs/1409.1556
[Figure: VGG architecture. https://imgur.com/uLXrKxe.jpg]
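The key design choice described in the abstract is stacking many small 3x3 convolutions in place of larger filters. Below is a minimal PyTorch sketch of one such stacked block; it illustrates the idea only and is not the authors' exact configuration.

```python
import torch
import torch.nn as nn

# Two stacked 3x3 convolutions cover a 5x5 receptive field with fewer
# parameters and an extra nonlinearity in between -- the core VGG idea.
vgg_block = nn.Sequential(
    nn.Conv2d(64, 128, kernel_size=3, padding=1),
    nn.ReLU(inplace=True),
    nn.Conv2d(128, 128, kernel_size=3, padding=1),
    nn.ReLU(inplace=True),
    nn.MaxPool2d(kernel_size=2, stride=2),  # halve spatial resolution between blocks
)

x = torch.randn(1, 64, 56, 56)
print(vgg_block(x).shape)  # torch.Size([1, 128, 28, 28])
```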
A pre-trained model has been previously trained on a dataset and contains the weights and biases that represent the features of that dataset. Learned features are often transferable to different data. For example, a model trained on a large dataset of bird images will contain learned features, like edges or horizontal lines, that are likely transferable to your own dataset.
Pre-trained models are beneficial for many reasons. By using a pre-trained model you save time: someone else has already spent the time and compute resources to learn these features, and your model will likely benefit from them.
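For example, here is a minimal sketch of this reuse with torchvision's pre-trained VGG16; the 4-class head is a placeholder for whatever your own task needs.

```python
import torch.nn as nn
from torchvision import models

# Load VGG16 with ImageNet weights and reuse its convolutional features.
model = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1)

for param in model.features.parameters():
    param.requires_grad = False  # freeze the pre-trained feature extractor

# Replace the final classifier layer for a hypothetical 4-class task;
# only this new head will be trained.
model.classifier[6] = nn.Linear(4096, 4)
```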
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains a corpus of early modern natural philosophy works that underlie the European Research Council-funded Starting Grant "The Normalisation of Natural Philosophy: How Teaching Practices Shaped the Evolution of Early Modern Science" (grant agreement No. 801653 NaturalPhilosophy), led by Dr. Andrea Sangiacomo at the Faculty of Philosophy of the University of Groningen.
The methodology behind the retrieval, cleaning, and annotation of this repository is described in the paper:
The difference between the total number of entries in this dataset and the number of titles we ran our analyses on comes from listing multiple volumes of the same work as a single entry. Work on this corpus continues, and further versions will be uploaded gradually, with supplementary information about publishers and publication places.
The dictionaries from which we selected the data in worksheets 2-5 are the following:
University of Groningen Team:
Special thanks to
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
In vision science, cascades of Linear+Nonlinear transforms are very successful in modeling a number of perceptual experiences. However, the conventional literature is usually too focused on only describing the forward input-output transform. Instead, in this work we present the mathematics of such cascades beyond the forward transform, namely the Jacobian matrices and the inverse. The fundamental reason for this analytical treatment is that it offers useful analytical insight into the psychophysics, the physiology, and the function of the visual system. For instance, we show how the trends of the sensitivity (volume of the discrimination regions) and the adaptation of the receptive fields can be identified in the expression of the Jacobian w.r.t. the stimulus. This matrix also tells us which regions of the stimulus space are encoded more efficiently in multi-information terms. The Jacobian w.r.t. the parameters shows which aspects of the model have a bigger impact on the response, and hence their relative relevance. The analytic inverse implies conditions for the response and model parameters to ensure appropriate decoding. From the experimental and applied perspective, (a) the Jacobian w.r.t. the stimulus is necessary in new experimental methods based on the synthesis of visual stimuli with interesting geometrical properties, (b) the Jacobian matrices w.r.t. the parameters are convenient to learn the model from classical experiments or alternative goal optimization, and (c) the inverse is a promising model-based alternative to blind machine-learning methods for neural decoding that do not include meaningful biological information. The theory is checked by building and testing a vision model that actually follows a modular Linear+Nonlinear program. Our illustrative differentiable and invertible model consists of a cascade of modules that account for brightness, contrast, energy masking, and wavelet masking. To stress the generality of this modular setting we show examples where some of the canonical Divisive Normalization modules are substituted by equivalent modules such as the Wilson-Cowan interaction model (at the V1 cortex) or a tone-mapping model (at the retina).
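To make the chain-rule structure concrete, here is a generic numeric sketch (not the authors' model): a toy cascade of linear+nonlinear modules whose Jacobian with respect to the stimulus is the product of per-stage Jacobians, with an elementwise tanh standing in for the divisive-normalization nonlinearity.

```python
import numpy as np

rng = np.random.default_rng(0)

# A toy 3-stage cascade: each stage applies a linear map followed by an
# elementwise saturating nonlinearity (a stand-in, not divisive normalization).
Ws = [rng.normal(size=(4, 4)) for _ in range(3)]
f = np.tanh
f_prime = lambda u: 1.0 - np.tanh(u) ** 2

def forward_and_jacobian(x):
    """Return the cascade response and its Jacobian w.r.t. the stimulus x."""
    J = np.eye(len(x))
    for W in Ws:
        u = W @ x
        J = np.diag(f_prime(u)) @ W @ J  # chain rule: J_k = diag(f'(u)) W J_{k-1}
        x = f(u)
    return x, J

x0 = rng.normal(size=4)
y, J = forward_and_jacobian(x0)
# |det J| measures local volume change in the response space, the quantity
# behind the discrimination-region trends mentioned in the abstract.
print(abs(np.linalg.det(J)))
```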
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Underwater Drowning Detection Dataset
This dataset contains 5,613 manually annotated underwater images for drowning detection research, captured in controlled swimming pool environments. It provides a balanced distribution of three behavioral states: Swimming (1,871 images), Struggling (1,871 images), and Drowning (1,871 images). All images were collected under real underwater conditions and annotated for object detection tasks using the YOLO format.
Key Features:
- High-resolution underwater images (640×640 pixels, RGB)
- YOLO .txt annotations with bounding boxes for three behavior classes
- Balanced class distribution to minimize model bias
- Data collected ethically with lifeguard supervision and participant consent
- Includes realistic challenges such as water distortion and lighting variability
Technical Details:
- Total Images: 5,613
- Training/Validation Split: 4,488 / 1,125
- Classes: Swimming, Struggling, Drowning
- Format: JPEG + YOLO annotation files
- Resolution: 640×640 pixels
- Baseline Performance: YOLOv8n achieved 97.5% mAP@50 on this dataset
Annotation Format:
Each image has a corresponding .txt file with annotations in YOLO format, where each line follows this structure:
class_id x_center y_center width height
Field Descriptions:
- class_id: Integer label for the class (0 = Swimming, 1 = Struggling, 2 = Drowning)
- x_center, y_center: Normalized center coordinates of the bounding box (values between 0.0 and 1.0)
- width, height: Normalized width and height of the bounding box (values between 0.0 and 1.0)
Example Annotation:
0 0.509896 0.568519 0.453125 0.581481
This line indicates a "Swimming" detection (class_id = 0) with a bounding box centered at 50.99% (horizontal) and 56.85% (vertical) of the image dimensions, covering 45.31% of the width and 58.15% of the height. (A parsing sketch follows this entry.)
Dataset Folder Structure:
datasets/
├── images/
│   ├── train/
│   │   ├── frame_00001.jpg
│   │   └── ...
│   └── val/
│       ├── frame_04489.jpg
│       └── ...
├── labels/
│   ├── train/
│   │   ├── frame_00001.txt
│   │   └── ...
│   └── val/
│       ├── frame_04489.txt
│       └── ...
├── classes.txt
└── README.md
Use and Applications:
This dataset is designed to support the development and evaluation of real-time AI systems for aquatic safety, including:
- Drowning detection models
- Multi-class object detection in underwater environments
- Research in underwater computer vision and human activity recognition
Citation:
If you use this dataset, please cite:
@dataset{underwater_drowning_detection_2025,
  title     = {Underwater Drowning Detection Dataset},
  author    = {Hamad Alzaabi and Saif Alzaabi and Sarah Kohail},
  year      = {2025},
  publisher = {Figshare},
  note      = {Manually annotated underwater images for drowning detection research}
}
Please also cite the related publication:
@inproceedings{Alzaabi2025,
  author    = {Hamad Alzaabi and Saif Alzaabi and Sarah Kohail},
  title     = {Multi-Swimmer Drowning Detection Using a Custom Annotated Underwater Dataset and Real-Time AI},
  booktitle = {Proceedings of the International Conference on Image Analysis and Processing (ICIAP)},
  year      = {2025}
}
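A minimal sketch for reading one of these label files and converting the normalized boxes back to pixel coordinates; the file path is illustrative.

```python
# Convert YOLO-format labels back to pixel-space boxes (640x640 images).
CLASSES = ["Swimming", "Struggling", "Drowning"]
W = H = 640

with open("labels/train/frame_00001.txt") as f:
    for line in f:
        cls, xc, yc, w, h = line.split()
        xc, yc, w, h = (float(v) for v in (xc, yc, w, h))
        x1 = (xc - w / 2) * W  # left edge in pixels
        y1 = (yc - h / 2) * H  # top edge in pixels
        x2 = (xc + w / 2) * W
        y2 = (yc + h / 2) * H
        print(CLASSES[int(cls)], round(x1), round(y1), round(x2), round(y2))
```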
Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically
Over 1.2 million annotated license plates from vehicles around the world. This dataset is tailored for license plate recognition tasks and includes images from a variety of web domains. Annotation details are provided in the About section below.
keywords: license plate dataset, car license plate detection, license plate recognition, automatic number plate recognition, vehicle license plate location, number plate dataset, labelled license plates, license plate characters, object detection dataset, license plate OCR, automatic vehicle number plate detection, computer vision license plate
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
BdSL47 is an original dataset constructed under the supervision of the Systems and Software Lab (SSL), Department of Computer Science and Engineering (CSE), Islamic University of Technology (IUT), and the Department of Computer Science and Engineering (CSE), United International University (UIU), Dhaka, Bangladesh. Ownership of this dataset belongs to the authors, who administered the collection process and obtained informed consent from the participating signers of the Bangla sign alphabets and digits. The dataset was constructed by collecting Bangla alphabet signs from 10 signers in a controlled setting. The signers vary in age, gender, hand shape, and skin color, and the collection deliberately incorporates challenges such as scaling, translation, hand rotation, hand orientation, lighting ambiance, and background. We collected webcam images of 47 signs (10 Bangla sign digits and 37 Bangla sign alphabets) and resized them to 640x480. For each RGB image sample, we detected the hand key points via the MediaPipe library, then generated CSV files containing the x, y, and depth coordinate values of the 21 hand key points extracted from each sample, constituting 21×3 = 63 input features.
The dataset contains 47,000 RGB input images of 47 signs (10 digits, 37 letters) of Bangla Sign Language. The images were processed with the MediaPipe framework, which detects 21 predefined hand key points per sample and provides normalized x and y coordinate values plus an estimated depth value. The 3D coordinate values are stored in .csv files (each file contains information on 100 image samples of the same sign), for 470 .csv files in total.
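A minimal sketch of this kind of keypoint extraction with MediaPipe's Python hands solution follows; the file name is illustrative and this is not necessarily the authors' exact pipeline.

```python
import cv2
import mediapipe as mp

hands = mp.solutions.hands.Hands(static_image_mode=True, max_num_hands=1)

image = cv2.imread("sign_sample.jpg")  # hypothetical input file
results = hands.process(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))

if results.multi_hand_landmarks:
    landmarks = results.multi_hand_landmarks[0].landmark
    # 21 keypoints, each with normalized x, y and an estimated depth z -> 63 features
    features = [v for lm in landmarks for v in (lm.x, lm.y, lm.z)]
    print(len(features))  # 63
```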
There are two folders, named "Bangla Sign Language Dataset - Sign Alphabets" and "Bangla Sign Language Dataset - Sign Digits". Under each, user-wise folders contain the sign images (raw images in jpg format) and CSV files (normalized 3D coordinates of the 21 hand keypoints with corresponding class labels). All files (around 1.12 GB) are available for direct download using the 'Download All' option at doi: 10.17632/pbb3w3f92y.3. The images and CSV files can be read and processed with Python (tested with Python 3.10.3) or any other programming language. Please cite our dataset if you use it in your research, following the format: "Rayeed, S M; Tuba, Sidratul Tamzida; Mahmud, Hasan; Mazumder, Mumtahin Habib Ullah; Hossain, Md. Saddam; Hasan, Md. Kamrul (2023), "BdSL47: A Complete Depth-based Bangla Sign Alphabet and Digit Dataset", Mendeley Data, V3, doi: 10.17632/pbb3w3f92y.3". The files associated with this dataset are licensed under a Creative Commons Attribution 4.0 International license. The code can be found here: https://github.com/SMRayeed/BdSL47-Recognition.
QUIC, a new and increasingly used transport protocol, addresses and resolves the limitations of TCP by offering improved security, performance, and features such as stream multiplexing and connection migration. These features, however, also present challenges for network operators who need to monitor and analyze web traffic. In this paper, we introduce VisQUIC, a labeled image dataset with configurable parameters of window length, pixel resolution, normalization, and labels. To develop the dataset, we captured QUIC traces from more than 9,000 websites, collecting more than 72,000 traces over a four-month period. The captured traces are converted into learnable, customizable RGB images, enabling an observer looking at the interactions between a client and a server to analyze and gain insights about QUIC encrypted connections. To illustrate the dataset's potential, we offer a use-case example of an observer estimating the number of HTTP/3 request/response pairs in a given QUIC connection, which can reveal server behavior, client-server interactions, and the load imposed by an observed connection. We formulate the problem as a discrete regression problem, train a machine learning (ML) model for it, and then evaluate it using the proposed dataset. Our use-case example is only one demonstration of the dataset's application; a number of such uses exist.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
We have evaluated suitable reference genes for real-time (RT) quantitative PCR (qPCR) analysis in Japanese pear (Pyrus pyrifolia). We tested the genes most frequently used in the literature, such as β-Tubulin, Histone H3, Actin, Elongation factor-1α, and Glyceraldehyde-3-phosphate dehydrogenase, together with the newly added genes Annexin, SAND, and TIP41. A total of 17 primer combinations for these eight genes were evaluated using cDNAs synthesized from 16 tissue samples from four groups, namely: flower bud, flower organ, fruit flesh, and fruit skin. Gene expression stabilities were analyzed using the geNorm and NormFinder software packages or by the ΔCt method. geNorm analysis indicated that the three best-performing genes are sufficient for reliable normalization of RT-qPCR data. Suitable reference genes differed among sample groups, underscoring the importance of validating the expression stability of reference genes in the samples of interest. Stability rankings were broadly similar between geNorm and NormFinder, suggesting the usefulness of these programs despite their different algorithms. The ΔCt method gave somewhat different results in some groups, such as flower organ or fruit skin, though the overall results correlated well with geNorm and NormFinder. Expression of two cold-inducible genes, PpCBF2 and PpCBF4, was quantified using the three most and the three least stable reference genes suggested by geNorm. Although the normalized quantities differed between them, the relative quantities within a group of samples were similar even when the least stable reference genes were used. Our data suggest that using the geometric mean of three reference genes for normalization is a reliable approach to evaluating gene expression by RT-qPCR. We propose that an initial evaluation of gene expression stability by the ΔCt method, followed by evaluation with geNorm or NormFinder for a limited number of superior candidates, is a practical way of identifying reliable reference genes.
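As a rough sketch of the normalization approach the abstract recommends (a geNorm-style normalization factor from the geometric mean of three reference genes), here is a toy calculation; the Ct values, efficiency, and gene labels are made up for illustration.

```python
import numpy as np

E = 2.0  # assumed amplification efficiency (perfect doubling per cycle)

# Hypothetical Ct values for three reference genes across four samples.
ct_ref = np.array([
    [21.3, 22.1, 20.8, 21.7],  # reference gene 1
    [18.9, 19.5, 18.4, 19.2],  # reference gene 2
    [25.0, 25.8, 24.6, 25.4],  # reference gene 3
])
ct_target = np.array([27.2, 28.9, 26.5, 28.1])  # e.g. a cold-inducible gene

# Relative quantities: the highest-expressing sample (lowest Ct) per gene is set to 1.
q_ref = E ** (ct_ref.min(axis=1, keepdims=True) - ct_ref)
q_target = E ** (ct_target.min() - ct_target)

# Normalization factor per sample: geometric mean of the reference-gene quantities.
nf = np.exp(np.log(q_ref).mean(axis=0))
print(q_target / nf)  # normalized relative expression per sample
```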