Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
These JSON documents contain mappings for materials science entity normalization. Each entity is mapped onto its most frequently occurring synonym that is not an acronym. We provide entity normalization for materials science properties (pro), applications (apl), sample descriptors (dsc), symmetry/phase labels (spl), synthesis methods (smt), and characterization methods (cmt). Each term has a "most common" entity to which it can be mapped; sub-entities are also included and have likewise been normalized.
*Please note: entities that occur infrequently in our corpus are unlikely to be normalized (and are less likely to be normalized correctly). In line with Zipf's law for NLP, infrequently occurring entities make up the largest portion of unique entities in the corpus, so a large fraction of entities in these JSON files are not normalized. However, frequently occurring terms like "XRD" are very likely to be normalized, and should be normalized correctly.
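As a rough illustration, a lookup against one of these mappings might look like the sketch below. The file name and key structure here are assumptions for illustration only, not the published schema, and the `normalize` helper is hypothetical.

```python
import json

# Hypothetical layout: one JSON file per entity type (pro, apl, dsc, spl, smt, cmt),
# each mapping a raw term to its most common non-acronym synonym.
with open("cmt_normalization.json") as f:
    cmt_map = json.load(f)

def normalize(term: str, mapping: dict) -> str:
    # Fall back to the raw term when it is absent (the long Zipf tail of rare entities).
    return mapping.get(term, term)

print(normalize("XRD", cmt_map))  # e.g. "X-ray diffraction", if present in the mapping
```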
CC0 1.0 (Public Domain Dedication): https://creativecommons.org/publicdomain/zero/1.0/
Introduction: This enhanced dataset builds upon "Uncovering Knowledge: A Clean Brain Tumor Dataset for Advanced Medical Research." It has undergone significant improvements; the original authors' contributions are acknowledged below.
Data Source: The initial data was sourced from the brain tumor classification MRI dataset, which is accessible at this link, and was generously shared by Sartaj.
Enhancements Made:
Removal of Redundant Data: Redundant data, including augmentations such as salt-and-pepper noise and geometric transformations, has been removed to ensure sample consistency.
Image Normalization: Images have been normalized using their grayscale histograms, enhancing image quality and comparability.
Resizing with Aspect Ratio Preservation: All images have been resized to a uniform 256 x 256 pixels while preserving the original aspect ratio, yielding consistent images without distortion (a preprocessing sketch follows this list).
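The exact normalization and padding scheme is not specified, so the following is only a plausible reading of these two steps: histogram equalization for the grayscale normalization, and zero-padding (letterboxing) to reconcile a fixed 256 x 256 output with a preserved aspect ratio.

```python
import cv2
import numpy as np

def preprocess(path: str, size: int = 256) -> np.ndarray:
    """Histogram-equalize a grayscale MRI slice and letterbox it to size x size."""
    img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    img = cv2.equalizeHist(img)  # one common form of grayscale-histogram normalization
    h, w = img.shape
    scale = size / max(h, w)  # keep the aspect ratio
    img = cv2.resize(img, (int(round(w * scale)), int(round(h * scale))))
    canvas = np.zeros((size, size), dtype=img.dtype)  # assumed zero-padding
    top = (size - img.shape[0]) // 2
    left = (size - img.shape[1]) // 2
    canvas[top:top + img.shape[0], left:left + img.shape[1]] = img
    return canvas
```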
Acknowledgment to Original Authors: We extend our sincere gratitude to the original authors of the "Uncovering Knowledge" dataset for their invaluable work in data collection and initial cleaning, which laid the foundation for this enhanced version.
License: This dataset is released under the CC0 license, making it accessible to the medical research community and promoting collaboration and innovation.
CC0 1.0 (Public Domain Dedication): https://creativecommons.org/publicdomain/zero/1.0/
In this work we investigate the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting. Our main contribution is a thorough evaluation of networks of increasing depth using an architecture with very small (3x3) convolution filters, which shows that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 weight layers. These findings were the basis of our ImageNet Challenge 2014 submission, where our team secured the first and the second places in the localisation and classification tracks respectively. We also show that our representations generalise well to other datasets, where they achieve state-of-the-art results. We have made our two best-performing ConvNet models publicly available to facilitate further research on the use of deep visual representations in computer vision.
Authors: Karen Simonyan, Andrew Zisserman
https://arxiv.org/abs/1409.1556
[Figure: VGG architecture. https://imgur.com/uLXrKxe.jpg]
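The key design choice described in the abstract is stacking many small 3x3 convolutions in place of larger filters. Below is a minimal PyTorch sketch of one such stacked block; it illustrates the idea only and is not the authors' exact configuration.

```python
import torch
import torch.nn as nn

# Two stacked 3x3 convolutions cover a 5x5 receptive field with fewer
# parameters and an extra nonlinearity in between -- the core VGG idea.
vgg_block = nn.Sequential(
    nn.Conv2d(64, 128, kernel_size=3, padding=1),
    nn.ReLU(inplace=True),
    nn.Conv2d(128, 128, kernel_size=3, padding=1),
    nn.ReLU(inplace=True),
    nn.MaxPool2d(kernel_size=2, stride=2),  # halve spatial resolution between blocks
)

x = torch.randn(1, 64, 56, 56)
print(vgg_block(x).shape)  # torch.Size([1, 128, 28, 28])
```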
A pre-trained model has been previously trained on a dataset and contains the weights and biases that represent the features of that dataset. Learned features are often transferable to different data. For example, a model trained on a large dataset of bird images will contain learned features, like edges or horizontal lines, that are likely transferable to your own dataset.
Pre-trained models are beneficial for many reasons. By using a pre-trained model you save time: someone else has already spent the time and compute resources to learn these features, and your model will likely benefit from them.
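For example, here is a minimal sketch of this reuse with torchvision's pre-trained VGG16; the 4-class head is a placeholder for whatever your own task needs.

```python
import torch.nn as nn
from torchvision import models

# Load VGG16 with ImageNet weights and reuse its convolutional features.
model = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1)

for param in model.features.parameters():
    param.requires_grad = False  # freeze the pre-trained feature extractor

# Replace the final classifier layer for a hypothetical 4-class task;
# only this new head will be trained.
model.classifier[6] = nn.Linear(4096, 4)
```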
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains a corpus of early modern natural philosophy works that underlie the European Research Council-funded Starting Grant "The Normalisation of Natural Philosophy: How Teaching Practices Shaped the Evolution of Early Modern Science" (grant agreement No. 801653 NaturalPhilosophy), led by Dr. Andrea Sangiacomo at the Faculty of Philosophy of the University of Groningen.
The methodology behind the retrieval, cleaning, and annotation of this repository is described in the paper:
The difference between the total number of entries in this dataset and the number of titles we ran our analyses on comes from listing multiple volumes of the same work as a single entry. Work on this corpus continues, and further versions will be uploaded gradually, with supplementary information about publishers and publication places.
The dictionaries from which we selected the data in worksheets 2-5 are the following:
University of Groningen Team:
Special thanks to
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
In vision science, cascades of Linear+Nonlinear transforms are very successful in modeling a number of perceptual experiences. However, the conventional literature is usually too focused on only describing the forward input-output transform. Instead, in this work we present the mathematics of such cascades beyond the forward transform, namely the Jacobian matrices and the inverse. The fundamental reason for this analytical treatment is that it offers useful analytical insight into the psychophysics, the physiology, and the function of the visual system. For instance, we show how the trends of the sensitivity (volume of the discrimination regions) and the adaptation of the receptive fields can be identified in the expression of the Jacobian w.r.t. the stimulus. This matrix also tells us which regions of the stimulus space are encoded more efficiently in multi-information terms. The Jacobian w.r.t. the parameters shows which aspects of the model have a bigger impact on the response, and hence their relative relevance. The analytic inverse implies conditions for the response and model parameters to ensure appropriate decoding. From the experimental and applied perspective, (a) the Jacobian w.r.t. the stimulus is necessary in new experimental methods based on the synthesis of visual stimuli with interesting geometrical properties, (b) the Jacobian matrices w.r.t. the parameters are convenient to learn the model from classical experiments or alternative goal optimization, and (c) the inverse is a promising model-based alternative to blind machine-learning methods for neural decoding that do not include meaningful biological information. The theory is checked by building and testing a vision model that actually follows a modular Linear+Nonlinear program. Our illustrative differentiable and invertible model consists of a cascade of modules that account for brightness, contrast, energy masking, and wavelet masking. To stress the generality of this modular setting we show examples where some of the canonical Divisive Normalization modules are substituted by equivalent modules such as the Wilson-Cowan interaction model (at the V1 cortex) or a tone-mapping model (at the retina).
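To make the chain-rule structure concrete, here is a generic numeric sketch (not the authors' model): a toy cascade of linear+nonlinear modules whose Jacobian with respect to the stimulus is the product of per-stage Jacobians, with an elementwise tanh standing in for the divisive-normalization nonlinearity.

```python
import numpy as np

rng = np.random.default_rng(0)

# A toy 3-stage cascade: each stage applies a linear map followed by an
# elementwise saturating nonlinearity (a stand-in, not divisive normalization).
Ws = [rng.normal(size=(4, 4)) for _ in range(3)]
f = np.tanh
f_prime = lambda u: 1.0 - np.tanh(u) ** 2

def forward_and_jacobian(x):
    """Return the cascade response and its Jacobian w.r.t. the stimulus x."""
    J = np.eye(len(x))
    for W in Ws:
        u = W @ x
        J = np.diag(f_prime(u)) @ W @ J  # chain rule: J_k = diag(f'(u)) W J_{k-1}
        x = f(u)
    return x, J

x0 = rng.normal(size=4)
y, J = forward_and_jacobian(x0)
# |det J| measures local volume change in the response space, the quantity
# behind the discrimination-region trends mentioned in the abstract.
print(abs(np.linalg.det(J)))
```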
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Underwater Drowning Detection Dataset
This dataset contains 5,613 manually annotated underwater images for drowning detection research, captured in controlled swimming pool environments. It provides a balanced distribution of three behavioral states: Swimming (1,871 images), Struggling (1,871 images), and Drowning (1,871 images). All images were collected under real underwater conditions and annotated for object detection tasks using the YOLO format.
Key Features:
- High-resolution underwater images (640×640 pixels, RGB)
- YOLO .txt annotations with bounding boxes for three behavior classes
- Balanced class distribution to minimize model bias
- Data collected ethically with lifeguard supervision and participant consent
- Includes realistic challenges such as water distortion and lighting variability
Technical Details:
- Total Images: 5,613
- Training/Validation Split: 4,488 / 1,125
- Classes: Swimming, Struggling, Drowning
- Format: JPEG + YOLO annotation files
- Resolution: 640×640 pixels
- Baseline Performance: YOLOv8n achieved 97.5% mAP@50 on this dataset
Annotation Format:
Each image has a corresponding .txt file with annotations in YOLO format, where each line follows this structure:
class_id x_center y_center width height
Field Descriptions:
- class_id: Integer label for the class (0 = Swimming, 1 = Struggling, 2 = Drowning)
- x_center, y_center: Normalized center coordinates of the bounding box (values between 0.0 and 1.0)
- width, height: Normalized width and height of the bounding box (values between 0.0 and 1.0)
Example Annotation:
0 0.509896 0.568519 0.453125 0.581481
This line indicates a "Swimming" detection (class_id = 0) with a bounding box centered at 50.99% (horizontal) and 56.85% (vertical) of the image dimensions, covering 45.31% of the width and 58.15% of the height. (A parsing sketch follows this entry.)
Dataset Folder Structure:
datasets/
├── images/
│   ├── train/
│   │   ├── frame_00001.jpg
│   │   └── ...
│   └── val/
│       ├── frame_04489.jpg
│       └── ...
├── labels/
│   ├── train/
│   │   ├── frame_00001.txt
│   │   └── ...
│   └── val/
│       ├── frame_04489.txt
│       └── ...
├── classes.txt
└── README.md
Use and Applications:
This dataset is designed to support the development and evaluation of real-time AI systems for aquatic safety, including:
- Drowning detection models
- Multi-class object detection in underwater environments
- Research in underwater computer vision and human activity recognition
Citation:
If you use this dataset, please cite:
@dataset{underwater_drowning_detection_2025,
  title     = {Underwater Drowning Detection Dataset},
  author    = {Hamad Alzaabi and Saif Alzaabi and Sarah Kohail},
  year      = {2025},
  publisher = {Figshare},
  note      = {Manually annotated underwater images for drowning detection research}
}
Please also cite the related publication:
@inproceedings{Alzaabi2025,
  author    = {Hamad Alzaabi and Saif Alzaabi and Sarah Kohail},
  title     = {Multi-Swimmer Drowning Detection Using a Custom Annotated Underwater Dataset and Real-Time AI},
  booktitle = {Proceedings of the International Conference on Image Analysis and Processing (ICIAP)},
  year      = {2025}
}
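A minimal sketch for reading one of these label files and converting the normalized boxes back to pixel coordinates; the file path is illustrative.

```python
# Convert YOLO-format labels back to pixel-space boxes (640x640 images).
CLASSES = ["Swimming", "Struggling", "Drowning"]
W = H = 640

with open("labels/train/frame_00001.txt") as f:
    for line in f:
        cls, xc, yc, w, h = line.split()
        xc, yc, w, h = (float(v) for v in (xc, yc, w, h))
        x1 = (xc - w / 2) * W  # left edge in pixels
        y1 = (yc - h / 2) * H  # top edge in pixels
        x2 = (xc + w / 2) * W
        y2 = (yc + h / 2) * H
        print(CLASSES[int(cls)], round(x1), round(y1), round(x2), round(y2))
```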
Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically
Over 1.2 million annotated license plates from vehicles around the world. This dataset is tailored for license plate recognition tasks and includes images from a variety of web domains. Annotation details are provided in the About section below.
keywords: license plate dataset, car license plate detection, license plate recognition, automatic number plate recognition, vehicle license plate location, number plate dataset, labelled license plates, license plate characters, object detection dataset, license plate OCR, automatic vehicle number plate detection, computer vision license plate
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
BdSL47 is an original dataset constructed under the supervision of the Systems and Software Lab (SSL), Department of Computer Science and Engineering (CSE), Islamic University of Technology (IUT), and the Department of Computer Science and Engineering (CSE), United International University (UIU), Dhaka, Bangladesh. Ownership of this dataset belongs to the authors, who administered the collection process and obtained informed consent from the participating signers of the Bangla sign alphabets and digits. The dataset was constructed by collecting Bangla alphabet signs from 10 signers in a controlled setting. The signers vary in age, gender, hand shape, and skin color, and the collection deliberately incorporates challenges such as scaling, translation, hand rotation, hand orientation, lighting ambiance, and background. We collected webcam images of 47 signs (10 Bangla sign digits and 37 Bangla sign alphabets) and resized them to 640x480. For each RGB image sample, we detected the hand key points via the MediaPipe library, then generated CSV files containing the x, y, and depth coordinate values of the 21 hand key points extracted from each sample, constituting 21×3 = 63 input features.
The dataset contains 47,000 RGB input images of 47 signs (10 digits, 37 letters) of Bangla Sign Language. The images were processed with the MediaPipe framework, which detects 21 predefined hand key points per sample and provides normalized x and y coordinate values plus an estimated depth value. The 3D coordinate values are stored in .csv files (each file contains information on 100 image samples of the same sign), for 470 .csv files in total.
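A minimal sketch of this kind of keypoint extraction with MediaPipe's Python hands solution follows; the file name is illustrative and this is not necessarily the authors' exact pipeline.

```python
import cv2
import mediapipe as mp

hands = mp.solutions.hands.Hands(static_image_mode=True, max_num_hands=1)

image = cv2.imread("sign_sample.jpg")  # hypothetical input file
results = hands.process(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))

if results.multi_hand_landmarks:
    landmarks = results.multi_hand_landmarks[0].landmark
    # 21 keypoints, each with normalized x, y and an estimated depth z -> 63 features
    features = [v for lm in landmarks for v in (lm.x, lm.y, lm.z)]
    print(len(features))  # 63
```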
There are two folders, named "Bangla Sign Language Dataset - Sign Alphabets" and "Bangla Sign Language Dataset - Sign Digits". Under each, user-wise folders contain the sign images (raw images in jpg format) and CSV files (normalized 3D coordinates of the 21 hand keypoints with corresponding class labels). All files (around 1.12 GB) are available for direct download using the 'Download All' option at doi: 10.17632/pbb3w3f92y.3. The images and CSV files can be read and processed with Python (tested with Python 3.10.3) or any other programming language. Please cite our dataset if you use it in your research, following the format: "Rayeed, S M; Tuba, Sidratul Tamzida; Mahmud, Hasan; Mazumder, Mumtahin Habib Ullah; Hossain, Md. Saddam; Hasan, Md. Kamrul (2023), "BdSL47: A Complete Depth-based Bangla Sign Alphabet and Digit Dataset", Mendeley Data, V3, doi: 10.17632/pbb3w3f92y.3". The files associated with this dataset are licensed under a Creative Commons Attribution 4.0 International license. The code can be found here: https://github.com/SMRayeed/BdSL47-Recognition.
QUIC, a new and increasingly used transport protocol, addresses and resolves the limitations of TCP by offering improved security, performance, and features such as stream multiplexing and connection migration. These features, however, also present challenges for network operators who need to monitor and analyze web traffic. In this paper, we introduce VisQUIC, a labeled image dataset with configurable parameters of window length, pixel resolution, normalization, and labels. To develop the dataset, we captured QUIC traces from more than 9,000 websites, collecting more than 72,000 traces over a four-month period. The captured traces are converted into learnable, customizable RGB images, enabling an observer looking at the interactions between a client and a server to analyze and gain insights about QUIC encrypted connections. To illustrate the dataset's potential, we offer a use-case example of an observer estimating the number of HTTP/3 request/response pairs in a given QUIC connection, which can reveal server behavior, client-server interactions, and the load imposed by an observed connection. We formulate the problem as a discrete regression problem, train a machine learning (ML) model for it, and then evaluate it using the proposed dataset. Our use-case example is only one demonstration of the dataset's application; a number of such uses exist.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
We have evaluated suitable reference genes for real-time (RT) quantitative PCR (qPCR) analysis in Japanese pear (Pyrus pyrifolia). We tested the genes most frequently used in the literature, such as β-Tubulin, Histone H3, Actin, Elongation factor-1α, and Glyceraldehyde-3-phosphate dehydrogenase, together with the newly added genes Annexin, SAND, and TIP41. A total of 17 primer combinations for these eight genes were evaluated using cDNAs synthesized from 16 tissue samples from four groups, namely: flower bud, flower organ, fruit flesh, and fruit skin. Gene expression stabilities were analyzed using the geNorm and NormFinder software packages or by the ΔCt method. geNorm analysis indicated that the three best-performing genes are sufficient for reliable normalization of RT-qPCR data. Suitable reference genes differed among sample groups, underscoring the importance of validating the expression stability of reference genes in the samples of interest. Stability rankings were broadly similar between geNorm and NormFinder, suggesting the usefulness of these programs despite their different algorithms. The ΔCt method gave somewhat different results in some groups, such as flower organ or fruit skin, though the overall results correlated well with geNorm and NormFinder. Expression of two cold-inducible genes, PpCBF2 and PpCBF4, was quantified using the three most and the three least stable reference genes suggested by geNorm. Although the normalized quantities differed between them, the relative quantities within a group of samples were similar even when the least stable reference genes were used. Our data suggest that using the geometric mean of three reference genes for normalization is a reliable approach to evaluating gene expression by RT-qPCR. We propose that an initial evaluation of gene expression stability by the ΔCt method, followed by evaluation with geNorm or NormFinder for a limited number of superior candidates, is a practical way of identifying reliable reference genes.
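As a rough sketch of the normalization approach the abstract recommends (a geNorm-style normalization factor from the geometric mean of three reference genes), here is a toy calculation; the Ct values, efficiency, and gene labels are made up for illustration.

```python
import numpy as np

E = 2.0  # assumed amplification efficiency (perfect doubling per cycle)

# Hypothetical Ct values for three reference genes across four samples.
ct_ref = np.array([
    [21.3, 22.1, 20.8, 21.7],  # reference gene 1
    [18.9, 19.5, 18.4, 19.2],  # reference gene 2
    [25.0, 25.8, 24.6, 25.4],  # reference gene 3
])
ct_target = np.array([27.2, 28.9, 26.5, 28.1])  # e.g. a cold-inducible gene

# Relative quantities: the highest-expressing sample (lowest Ct) per gene is set to 1.
q_ref = E ** (ct_ref.min(axis=1, keepdims=True) - ct_ref)
q_target = E ** (ct_target.min() - ct_target)

# Normalization factor per sample: geometric mean of the reference-gene quantities.
nf = np.exp(np.log(q_ref).mean(axis=0))
print(q_target / nf)  # normalized relative expression per sample
```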