39 datasets found

Data from: Similarity-Principle-Based Machine Learning Method for Clinical...
tandf.figshare.com
txt
Updated May 31, 2023
Susan Hwang; Mark Chang (2023). Similarity-Principle-Based Machine Learning Method for Clinical Trials and Beyond [Dataset]. http://doi.org/10.6084/m9.figshare.20272392.v2
Available download formats: txt
Unique identifier
https://doi.org/10.6084/m9.figshare.20272392.v2
Dataset updated
May 31, 2023
Dataset provided by
Taylor & Francis
Authors
Susan Hwang; Mark Chang
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
With recent success in supervised learning, artificial intelligence (AI) and machine learning (ML) can play a vital role in precision medicine. Deep learning neural networks have been used in drug discovery when large datasets are available. However, applications of machine learning in clinical trials with small sample sizes (a few hundred) are limited. We propose a Similarity-Principle-Based Machine Learning (SBML) method, which is applicable to both small and large sample size problems. In SBML, attribute-scaling factors are introduced to objectively determine the relative importance of each attribute (predictor). The gradient method is used in learning (training), that is, in updating the attribute-scaling factors. We evaluate SBML when the sample size is small and investigate the effects of tuning parameters. Simulations show that SBML achieves better predictions in terms of mean squared errors for various complicated nonlinear situations than full linear models, optimal and ridge regressions, mixed effect models, support vector machine and decision tree methods.
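The idea of learning attribute-scaling factors by a gradient method can be illustrated with a minimal sketch. This is not the authors' implementation: the exponential similarity kernel, the leave-one-out loss, the finite-difference gradient, and the names `predict`, `loo_mse`, and `fit_scale` are all assumptions made for illustration.

```python
import math


def predict(x, X_train, y_train, scale):
    """Similarity-weighted average of training outcomes (assumed kernel)."""
    weights = []
    for row in X_train:
        # Scaled squared distance: a larger scale makes an attribute matter more.
        d2 = sum((r * (a - b)) ** 2 for r, a, b in zip(scale, x, row))
        weights.append(math.exp(-d2))
    total = sum(weights)
    if total == 0.0:
        # All similarities underflowed; fall back to the plain mean.
        return sum(y_train) / len(y_train)
    return sum(w * y for w, y in zip(weights, y_train)) / total


def loo_mse(X, y, scale):
    """Leave-one-out mean squared error for a given set of scaling factors."""
    err = 0.0
    for i in range(len(X)):
        X_rest = X[:i] + X[i + 1:]
        y_rest = y[:i] + y[i + 1:]
        err += (predict(X[i], X_rest, y_rest, scale) - y[i]) ** 2
    return err / len(X)


def fit_scale(X, y, steps=50, lr=0.5, eps=1e-4):
    """Gradient descent on the scaling factors, keeping only improving steps."""
    scale = [1.0] * len(X[0])
    best = loo_mse(X, y, scale)
    for _ in range(steps):
        grad = []
        for k in range(len(scale)):
            bumped = scale[:]
            bumped[k] += eps
            grad.append((loo_mse(X, y, bumped) - best) / eps)
        candidate = [s - lr * g for s, g in zip(scale, grad)]
        loss = loo_mse(X, y, candidate)
        if loss < best:
            scale, best = candidate, loss
        else:
            lr *= 0.5  # step was too large; shrink it and retry
    return scale
```

Because a step is accepted only when it lowers the leave-one-out error, the fitted factors are never worse than the all-ones starting point on that criterion. With a few hundred records the quadratic cost of the leave-one-out loop stays manageable.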
EDA augmentation parameters.
plos.figshare.com
xls
Updated Sep 26, 2024
+ more versions
Rodrigo Gutiérrez Benítez; Alejandra Segura Navarrete; Christian Vidal-Castro; Claudia Martínez-Araneda (2024). EDA augmentation parameters. [Dataset]. http://doi.org/10.1371/journal.pone.0310707.t009
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pone.0310707.t009
Dataset updated
Sep 26, 2024
Dataset provided by
PLOS ONE
Authors
Rodrigo Gutiérrez Benítez; Alejandra Segura Navarrete; Christian Vidal-Castro; Claudia Martínez-Araneda
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Over the last ten years, social media has become a crucial data source for businesses and researchers, providing a space where people can express their opinions and emotions. To analyze this data and classify emotions and their polarity in texts, natural language processing (NLP) techniques such as emotion analysis (EA) and sentiment analysis (SA) are employed. However, the effectiveness of these tasks using machine learning (ML) and deep learning (DL) methods depends on large labeled datasets, which are scarce in languages like Spanish. To address this challenge, researchers use data augmentation (DA) techniques to artificially expand small datasets. This study aims to investigate whether DA techniques can improve classification results using ML and DL algorithms for sentiment and emotion analysis of Spanish texts. Various text manipulation techniques were applied, including transformations, paraphrasing (back-translation), and text generation using generative adversarial networks, to small datasets such as song lyrics, social media comments, headlines from national newspapers in Chile, and survey responses from higher education students. The findings show that the Convolutional Neural Network (CNN) classifier achieved the most significant improvement, with an 18% increase using the Generative Adversarial Networks for Sentiment Text (SentiGan) on the Aggressiveness (Seriousness) dataset. Additionally, the same classifier model showed an 11% improvement using the Easy Data Augmentation (EDA) on the Gender-Based Violence dataset. The performance of the Bidirectional Encoder Representations from Transformers (BETO) also improved by 10% on the back-translation augmented version of the October 18 dataset, and by 4% on the EDA augmented version of the Teaching survey dataset. These results suggest that data augmentation techniques enhance performance by transforming text and adapting it to the specific characteristics of the dataset. 
Through experimentation with various augmentation techniques, this research provides valuable insights into the analysis of subjectivity in Spanish texts and offers guidance for selecting algorithms and techniques based on dataset features.
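As a rough illustration of the kind of text transformation that Easy Data Augmentation (EDA) applies, the sketch below implements two of its operations, random swap and random deletion. The function names, parameters, and the example sentence are hypothetical and are not taken from the dataset's parameter table; the synonym-based operations are omitted because they need a thesaurus.

```python
import random


def random_swap(words, n_swaps, rng):
    """Return a copy of `words` with `n_swaps` random pairs of positions swapped."""
    words = list(words)
    if len(words) < 2:
        return words
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(words)), 2)
        words[i], words[j] = words[j], words[i]
    return words


def random_deletion(words, p, rng):
    """Drop each word with probability `p`, always keeping at least one word."""
    if not words:
        return []
    kept = [w for w in words if rng.random() >= p]
    return kept if kept else [rng.choice(words)]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    sentence = "el servicio fue muy bueno".split()
    print(" ".join(random_swap(sentence, 1, rng)))
    print(" ".join(random_deletion(sentence, 0.2, rng)))
```

Passing an explicit `random.Random` instance keeps the augmentation reproducible, which helps when comparing classifiers across augmented versions of the same small dataset.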
Soil Types
kaggle.com
Updated Jun 22, 2021
Prasansha Satpathy (2021). Soil Types [Dataset]. https://www.kaggle.com/prasanshasatpathy/soil-types/code
Dataset updated
Jun 22, 2021
Dataset provided by
Kaggle http://kaggle.com/
Authors
Prasansha Satpathy
Description
Context

The dataset consists of 5 varieties of soil images in 5 directories or folders. It is a very small dataset meant for beginners building ML models. A small dataset helps in learning without spending half an hour or more, and expensive computation, on training the model. The results won't be great because the dataset is so small, which leads to overfitting and other issues. It will probably be updated soon to a larger dataset with better images and collection.

Content

The dataset consists of 5 varieties of soil images in 5 directories or folders. The dataset was made because I was unable to find any reliable image dataset for soil varieties. It will soon be updated to be richer, and I will soon be crowdsourcing images for that purpose.

The dataset has no annotations; to make up for that, real-time augmentation can be applied. One can go through this notebook to learn how... https://www.kaggle.com/prasanshasatpathy/soil-type-image-classification
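Since the images sit in one folder per soil variety, a common way to obtain labels is to use the folder name as the class. The sketch below assumes that layout; the function name `index_image_folders` and the accepted file suffixes are hypothetical, and the actual folder names in the dataset are not stated here.

```python
from pathlib import Path

# Assumed set of image extensions; adjust to whatever the folders contain.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def index_image_folders(root):
    """Return (image_path, label) pairs, taking each subfolder name as the label."""
    root = Path(root)
    samples = []
    for class_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        for image_path in sorted(class_dir.iterdir()):
            if image_path.suffix.lower() in IMAGE_SUFFIXES:
                samples.append((image_path, class_dir.name))
    return samples
```

The resulting list can be shuffled and split before being handed to whichever augmentation pipeline is used for training.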

Acknowledgements

I have made the initial set of this small dataset. I expect collaborations soon to increase the size and the number of soil types in the dataset. The method for contributing will be released soon.

Inspiration

A better dataset with less complexity is really necessary.
UiT_TILs - Replication Data for "A Pragmatic Machine Learning Approach to...
search.dataone.org
dataverse.no
Updated Jan 5, 2024
Kilvaer, Thomas K (2024). UiT_TILs - Replication Data for "A Pragmatic Machine Learning Approach to Quantify Tumor Infiltrating Lymphocytes in Whole Slide Images" [Dataset]. http://doi.org/10.18710/4YN9SZ
Unique identifier
https://doi.org/10.18710/4YN9SZ
Dataset updated
Jan 5, 2024
Dataset provided by
DataverseNO
Authors
Kilvaer, Thomas K
Time period covered
Jan 1, 1993 - Jan 1, 2003
Description
This dataset can be used to replicate the findings in "A Pragmatic Machine Learning Approach to Quantify Tumor Infiltrating Lymphocytes in Whole Slide Images". The motivation for this paper is that increased levels of tumor infiltrating lymphocytes (TILs) indicate favorable outcomes in many types of cancer. Our aim is to leverage computational pathology to automatically quantify TILs in standard diagnostic whole-tissue hematoxylin and eosin stained section slides (H&E slides). Our approach is to transfer an open source machine learning method for segmentation and classification of nuclei in H&E slides trained on public data to TIL quantification without manual labeling of our data. Our results show that improved data augmentation improves immune cell detection in H&E WSIs. Moreover, the resulting TIL quantification correlates to patient prognosis and compares favorably to the current state-of-the-art method for immune cell detection in non-small lung cancer (current standard CD8 cells in DAB stained TMAs HR 0.34 95% CI 0.17-0.68 vs TILs in HE WSIs: HoVer-Net PanNuke Model HR 0.30 95% CI 0.15-0.60). Moreover, we implemented a cloud based system to train, deploy, and visually inspect machine learning based annotation for H&E slides. Our pragmatic approach bridges the gap between machine learning research, translational clinical research and clinical implementation. However, validation in prospective studies is needed to assert that the method works in a clinical setting. The dataset is comprised of three parts: 1) Twenty image patches with and without overlays used by pathologists to manually evaluate the output of the deep learning models, 2) The models trained and subsequently used for inference in the paper, 3) the patient dataset with corresponding image patches used to clinically validate the output of the deep learning models. The tissue samples were collected from patients diagnosed between 1993 and 2003. 
Supplementing information was collected retrospectively in the time period 2006-2017. The images were produced in 2017.
Clay, Viviane (2021). Dataset: Data from neural network training in the...
service.tib.eu
Updated May 16, 2025
(2025). Clay, Viviane (2021). Dataset: Data from neural network training in the obstacle tower environment to investigate embodied, weakly supervised learning. https://doi.org/10.26249/FK2/BFDUZO [Dataset]. https://service.tib.eu/ldmservice/dataset/osn-doi-10-26249-fk2-bfduzo
Dataset updated
May 16, 2025
Description
Abstract: Description: This repository presents data collected to investigate the role of embodiment and supervision in learning. This is done inside a simulated 3D maze world with a navigation task using mainly visual input in the form of RGB images. The main contribution of this data repository is to provide a network model trained in this environment with weak supervision and a closed loop between action and perception. Additionally, control networks are provided which were trained with varying degrees of supervision and embodiment. In the corresponding paper [1] the representations of these networks are compared based on sparsity measures as well as the content of the encodings and the possibility to extract semantic labels. For the training of the control conditions several new data sets were created which are also included here. They contain a collection of images from the simulated world with corresponding semantic labels. Overall, they provide a good basis for further analysis and a more in-depth investigation of representation learning and the effect of embodiment and supervision on representations. Abstract: Steps to reproduce: Data was generated through a 3D simulation of a maze environment called Obstacle Tower. The data of interest are the trained neural network weights and the network activations corresponding to different input frames. Three main networks were trained: a reinforcement learning agent trained through interaction with the simulated environment, an autoencoder trained to reconstruct images collected by the agent, and a classifier trained to classify objects in the images. Exact training and testing conditions, hyperparameters and network structure are provided in the corresponding paper. For the training of the reinforcement learning agent the Unity ml-agents toolkit PPO implementation is used with small modifications for extra data collection and control experiments.
The code we used can be found here: https://github.com/vkakerbeck/ml-agents-dev . Model checkpoint files are saved for different points in training, but mostly the final version of the network is analysed in the corresponding paper [1]. The autoencoder and classifier are trained using Python with TensorFlow and Keras. The corresponding code can be found here: https://github.com/vkakerbeck/Learning-World-Representations/tree/master/DataAnalysis . The data also contains activations in the hidden layer of the network corresponding to 4000 test images for all three networks. Code for this can be found in the same GitHub repository. The datasets used for training the autoencoder and classifier were created by collecting observations in the Obstacle Tower environment using the trained agent. These observations were then labelled automatically, and the labels were cross-checked by hand. A description of the individual files is included in the data folder (Description.txt). Due to storage constraints, not all model checkpoint files used to create figure 6 of the paper could be uploaded. However, feel free to contact me (vkakerbeck[at]uos.de) if you are interested in these detailed checkpoint files of the control runs and I will make them available to you.
Neural network ensembles and FEFF spectra for multi-modal small molecule...
zenodo.org
explore.openaire.eu
bz2
Updated Jun 28, 2023
Matthew R. Carbone; Deyu Lu (2023). Neural network ensembles and FEFF spectra for multi-modal small molecule chemical motif prediction [Dataset]. http://doi.org/10.5281/zenodo.8087823
Available download formats: bz2
Unique identifier
https://doi.org/10.5281/zenodo.8087823
Dataset updated
Jun 28, 2023
Dataset provided by
Zenodo http://zenodo.org/
Authors
Matthew R. Carbone; Deyu Lu
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Data

22-12-05-data: original molecular XANES data created from Ghose et al.

23-04-26-ml-data: machine learning-ready data which is prepared in the format required by Crescendo.

23-05-03-hp: hyper-parameter tuning results from 23-04-26-ml-data.

23-05-05-ensembles: ensemble results from 23-04-26-ml-data.

23-05-11-ml-data-CUTOFF8: a special machine learning-ready dataset constructed by a unique partitioning: only molecules with less than or equal to 8 atoms/molecule are used for training/validation, the rest are used for testing.
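The size-based partitioning described for the CUTOFF8 dataset can be sketched as follows. The record layout (a dict with an `n_atoms` field) and the function name are assumptions for illustration only; the actual files follow whatever format Crescendo requires.

```python
def split_by_atom_count(molecules, cutoff=8):
    """Put molecules with at most `cutoff` atoms in train/validation, the rest in test."""
    train_val = [m for m in molecules if m["n_atoms"] <= cutoff]
    test = [m for m in molecules if m["n_atoms"] > cutoff]
    return train_val, test


if __name__ == "__main__":
    # Hypothetical records; real entries would also carry spectra and targets.
    records = [
        {"id": "mol_a", "n_atoms": 5},
        {"id": "mol_b", "n_atoms": 8},
        {"id": "mol_c", "n_atoms": 9},
    ]
    train_val, test = split_by_atom_count(records)
    print(len(train_val), len(test))
```

A split like this tests whether a model trained on small molecules extrapolates to larger ones, instead of interpolating within a randomly shuffled pool.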

Funding

This research is based upon work supported by the U.S. Department of Energy, Office of Science, Office of Basic Energy Sciences, under Award Number FWP PS-030. This research also used theory and computational resources of the Center for Functional Nanomaterials, which is a U.S. Department of Energy Office of Science User Facility, and the Scientific Data and Computing Center, a component of the Computational Science Initiative, at Brookhaven National Laboratory under Contract No. DE-SC0012704.
Technical Debt identification in Issue Trackers using Natural Language...
zenodo.org
explore.openaire.eu
bin
Updated Mar 1, 2023
+ more versions
AAAAAA; BBBBB; CCCC; DDD (2023). Technical Debt identification in Issue Trackers using Natural Language Processing based on Transformers [Dataset]. http://doi.org/10.5281/zenodo.7221631
Available download formats: bin
Unique identifier
https://doi.org/10.5281/zenodo.7221631
Dataset updated
Mar 1, 2023
Dataset provided by
Zenodo http://zenodo.org/
Authors
AAAAAA; BBBBB; CCCC; DDD
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
In order to ensure transparency and reproducibility, we have made everything available publicly here, including the Code, Models, Datasets and more. All the files and their functionality used in this paper are explained clearly in the README.md file.

Background: Technical Debt (TD) needs to be controlled and tracked during software development. Current support, such as static analysis tools and even ML-based automatic tagging, is still ineffective, especially for context-dependent TD.

Aim: We study the usage of a large TD dataset in combination with cutting-edge Natural Language Processing (NLP) approaches to classify TD automatically in issue trackers, allowing the identification and tracking of informal TD conversations.

Method: We mine and analyse more than 160GB of textual data from GitHub projects, collecting over 55,600 TD issues and consolidating them into a large dataset (GTD-dataset). We then use our dataset to train state-of-the-art Transformer ML models, before performing a quantitative case study on three projects and evaluating the performance metrics during inference. Additionally, we study the adaptation of our model to classify context-dependent TD in an unseen project, by retraining the model including different percentages of the TD issues in the target project.

Results: (i) We provide the GTD-dataset, the most comprehensive dataset of TD issues to date, including issues from 6,401 unique public repositories with various contexts;

(ii) By training state-of-the-art Transformers using the GTD-dataset, we achieve performance metrics that outperform previous approaches;

(iii) We show that our model can provide a relatively reliable tool to automatically classify TD in issue trackers, especially when adapted to unseen projects where the training includes a small portion of TD issues in the new project.

Conclusion: Our results indicate that we have taken significant steps towards closing the gap to practically and semi-automatically track TD issues in issue trackers.
Data from: Machine Learning-Assisted QSAR Models on Contaminant Reactivity...
datasetcatalog.nlm.nih.gov
acs.figshare.com
Updated Dec 15, 2021
Zhong, Shifa; Zhang, Yanping; Zhang, Huichun (2021). Machine Learning-Assisted QSAR Models on Contaminant Reactivity Toward Four Oxidants: Combining Small Data Sets and Knowledge Transfer [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0000902839
Dataset updated
Dec 15, 2021
Authors
Zhong, Shifa; Zhang, Yanping; Zhang, Huichun
Description
To develop predictive models for the reactivity of organic contaminants toward four oxidants (SO4•–, HClO, O3, and ClO2), all with small sample sizes, we proposed two approaches: combining small data sets and transferring knowledge between them. We first merged these data sets and developed a unified model using machine learning (ML), which showed better predictive performance than the individual models for HClO (RMSEtest: 2.1 to 2.04), O3 (2.06 to 1.94), ClO2 (1.77 to 1.49), and SO4•– (0.75 to 0.70) because the model “corrected” the wrongly learned effects of several atom groups. We further developed knowledge transfer models for three pairs of the data sets and observed different predictive performances: improved for O3 (RMSEtest: 2.06 to 2.01)/HClO (2.10 to 1.98), mixed for O3 (2.06 to 2.01)/ClO2 (1.77 to 1.95), and unchanged for ClO2 (1.77 to 1.77)/HClO (2.1 to 2.1). The effectiveness of the latter approach depended on whether there was consistent knowledge shared between the data sets and on the performance of the individual models. We also compared our approaches with multitask learning and image-based transfer learning and found that our approaches consistently improved the predictive performance for all data sets while the other two did not. This study demonstrated the effectiveness of combining small, similar data sets and transferring knowledge between them to improve ML model performance.
Airfoil Computational Fluid Dynamics - 2k shapes, 25 AoA's, 3 Re numbers
catalog.data.gov
data.openei.org
+2more
Updated Jan 3, 2024
+ more versions
National Renewable Energy Laboratory (NREL) (2024). Airfoil Computational Fluid Dynamics - 2k shapes, 25 AoA's, 3 Re numbers [Dataset]. https://catalog.data.gov/dataset/airfoil-computational-fluid-dynamics-2k-shapes-25-aoas-3-re-numbers
Dataset updated
Jan 3, 2024
Dataset provided by
National Renewable Energy Laboratory (NREL)
Description
This dataset contains aerodynamic quantities - including flow field values (momentum, energy, and vorticity) and summary values (coefficients of lift, drag, and momentum) - for 1,830 airfoil shapes computed using the HAM2D CFD (computational fluid dynamics) model. The airfoil shapes were designed using the separable shape tensor parameterization that encodes two-dimensional shapes as elements of the Grassmann manifold. This data-driven approach learns two independent parameter spaces from a collection of sample airfoils. The first captures large-scale, linear perturbations, and the second defines small-scale, higher-order perturbations. For this dataset, we used the G2Aero database of over 19,000 airfoil shapes to learn a parameter space that captured a wide array of shape characteristics. We sampled airfoil designs over both parameter spaces to explore the full range of possible shape variations. The aerodynamic quantities for the generated airfoils were obtained using the HAM2D code, which is a finite-volume Reynolds-averaged Navier-Stokes (RANS) flow solver. We employ a fifth-order WENO scheme for spatial reconstruction with Roe's flux difference scheme for inviscid flux and second-order central differencing for viscous flux. A preconditioned GMRES method is applied for implicit integration. The Spalart-Allmaras 1-eq turbulence model is used for the turbulence closure, and the Medida-Baeder 2-eq transition model is applied to account for the effects of laminar-turbulent transition. The airfoil grid is generated with a total of 400 points on the airfoil surface, an initial wall-normal spacing of y+ = 1, and an outer boundary located 300 chord lengths away from the wall. The CFD simulations are performed at a freestream Mach number of 0.1, for three different Reynolds numbers (3M, 6M, and 9M), and for 25 angles of attack from -4 deg. to 20 deg. in 1 degree increments.
Across all these various parameters, this dataset includes the results from over 250,000 CFD simulations. The simulations were performed using the Bridges-2 system at the Pittsburgh Supercomputing Center in February 2023 as part of the INTEGRATE project funded by the Advanced Research Projects Agency - Energy, in the U.S. Department of Energy. The data was collected, reformatted, and preprocessed for this OEDI submission in July 2023 under the Foundational AI for Wind Energy project funded by the U.S. Department of Energy Wind Energy Technologies Office. This dataset is intended to serve as a benchmark against which new artificial intelligence (AI) or machine learning (ML) tools may be tested. Baseline AI/ML methods for analyzing this dataset have been implemented, and a link to their repository containing those models has been provided. The .h5 data file structure can be found in the GitHub Repository resource under explore_airfoil_2k_data.ipynb.
Data from: Modeling the ferroelectric phase transition in barium titanate...
archive.materialscloud.org
text/markdown, zip
Updated Apr 10, 2024
Lorenzo Gigli; Alexander Goscinski; Michele Ceriotti; Gareth A. Tribello (2024). Modeling the ferroelectric phase transition in barium titanate with DFT accuracy and converged sampling [Dataset]. http://doi.org/10.24435/materialscloud:xw-g5
Available download formats: text/markdown, zip
Unique identifier
https://doi.org/10.24435/materialscloud:xw-g5
Dataset updated
Apr 10, 2024
Dataset provided by
Materials Cloud
Authors
Lorenzo Gigli; Alexander Goscinski; Michele Ceriotti; Gareth A. Tribello
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The accurate description of the structural and thermodynamic properties of ferroelectrics has been one of the most remarkable achievements of Density Functional Theory (DFT). However, running large simulation cells with DFT is computationally demanding, while simulations of small cells are often plagued with non-physical effects that are a consequence of the system's finite size. Therefore, one is often forced to use empirical models that describe the physics of the material in terms of effective interaction terms, fitted to the results from DFT, to perform simulations that do not suffer from finite size effects. In this study we use a machine-learning (ML) potential trained on DFT, in combination with accelerated sampling techniques, to converge the thermodynamic properties of Barium Titanate (BTO) with first-principles accuracy and a full atomistic description. Our results indicate that the predicted Curie temperature depends strongly on the choice of DFT functional and system size, due to the presence of emergent long-range directional correlations in the local dipole fluctuations. Our findings demonstrate how the combination of ML models and traditional bottom-up modeling allows one to investigate emergent phenomena with the accuracy of first-principles calculations and the large size and time scales afforded by empirical models.
Animal Health Machine Learning Market Market Research Report 2033
researchintelo.com
csv, pdf, pptx
Updated Jul 24, 2025
Research Intelo (2025). Animal Health Machine Learning Market Market Research Report 2033 [Dataset]. https://researchintelo.com/report/animal-health-machine-learning-market-market
Available download formats: pptx, pdf, csv
Dataset updated
Jul 24, 2025
Dataset authored and provided by
Research Intelo
License
https://researchintelo.com/privacy-and-policy
Time period covered
2024 - 2033
Area covered
Global
Description
Animal Health Machine Learning Market Outlook

According to our latest research, the global Animal Health Machine Learning market size reached USD 1.42 billion in 2024, demonstrating robust momentum driven by technological advancements and increasing demand for intelligent animal health solutions. The market is projected to grow at a CAGR of 17.3% from 2025 to 2033, reaching a forecasted value of USD 6.14 billion by 2033. This exceptional growth trajectory is primarily fueled by the rising adoption of machine learning (ML) technologies in veterinary diagnostics, disease surveillance, and precision livestock management, enabling data-driven decision-making and improved animal welfare across the globe.
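The relationship between a starting value, a compound annual growth rate (CAGR), and a forecast can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, and treating 2024 to 2033 as nine compounding periods is an assumption, since the report does not state its base-year convention; the rate implied under that assumption need not match the quoted figure exactly.

```python
def project(start_value, rate, years):
    """Compound `start_value` at `rate` per year for `years` years."""
    return start_value * (1.0 + rate) ** years


def implied_cagr(start_value, end_value, years):
    """Annual growth rate that takes `start_value` to `end_value` in `years` years."""
    return (end_value / start_value) ** (1.0 / years) - 1.0


if __name__ == "__main__":
    # Figures quoted in the report, in USD billions; nine periods is assumed.
    print(f"implied CAGR: {implied_cagr(1.42, 6.14, 9):.2%}")
    print(f"projection at 17.3%: {project(1.42, 0.173, 9):.2f}")
```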

One of the primary growth factors for the Animal Health Machine Learning market is the increasing prevalence of zoonotic diseases and the subsequent need for advanced diagnostic and monitoring tools. Machine learning algorithms are revolutionizing the way veterinarians and animal health professionals detect, diagnose, and manage diseases in both companion and livestock animals. By leveraging vast datasets from electronic health records, imaging, and biosensors, ML models can identify subtle patterns and predict disease outbreaks with remarkable accuracy. This capability is especially critical in preventing the spread of infectious diseases, reducing economic losses in the livestock industry, and ensuring food safety. Furthermore, the integration of ML with remote monitoring devices and wearable sensors is enabling continuous health surveillance, early intervention, and personalized treatment plans, marking a paradigm shift in animal healthcare management.

Another significant driver is the expanding role of precision livestock farming, which relies heavily on machine learning to optimize productivity, resource utilization, and animal welfare. As the global demand for animal protein rises, farmers and producers are increasingly adopting ML-powered solutions to monitor herd health, track behavioral changes, and automate feeding and breeding processes. These technologies not only enhance operational efficiency but also contribute to sustainable farming practices by minimizing the use of antibiotics and reducing environmental impact. Additionally, ML-driven drug discovery platforms are accelerating the development of novel therapeutics and vaccines for animal diseases, shortening research timelines and improving success rates. The growing collaboration between technology providers, research institutes, and veterinary organizations is further catalyzing innovation and expanding the application landscape of machine learning in animal health.

The market's growth is also underpinned by favorable regulatory frameworks, increased investments in veterinary informatics, and the rising awareness of animal welfare among consumers and stakeholders. Governments and industry bodies across North America, Europe, and Asia Pacific are actively promoting digital transformation in the animal health sector through grants, pilot projects, and public-private partnerships. The proliferation of cloud-based platforms and advancements in big data analytics are making ML solutions more accessible and scalable, even for small and medium-sized animal farms. Despite challenges such as data privacy concerns and the need for skilled professionals, the overall outlook for the Animal Health Machine Learning market remains highly optimistic, with significant opportunities for innovation and value creation in the coming decade.

From a regional perspective, North America currently dominates the Animal Health Machine Learning market, accounting for a substantial share of global revenues, followed by Europe and Asia Pacific. The United States, in particular, is a frontrunner due to its advanced veterinary infrastructure, high adoption of digital health technologies, and strong presence of leading market players. Europe is witnessing rapid growth driven by stringent animal welfare regulations and increased research funding, while Asia Pacific is emerging as a lucrative market owing to its large livestock population, rising pet ownership, and government initiatives to modernize the agricultural sector. Latin America and the Middle East & Africa are gradually catching up, supported by improving veterinary services and growing awareness of the benefits of ML-based animal health solutions.

Component Analysis

The Animal Health Machine Learning market is segmented by co
DataSheet1_AI Models to Assist Vancomycin Dosage Titration.docx
datasetcatalog.nlm.nih.gov
frontiersin.figshare.com
Updated Feb 8, 2022
Fu, Zhiyan; Wang, Zhiyu; Ong, Chiat Ling Jasmine (2022). DataSheet1_AI Models to Assist Vancomycin Dosage Titration.docx [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0000417974
Dataset updated
Feb 8, 2022
Authors
Fu, Zhiyan; Wang, Zhiyu; Ong, Chiat Ling Jasmine
Description
Background: Effective treatment using the antibiotic vancomycin requires close monitoring of serum drug levels due to its narrow therapeutic index. In current practice, physicians use various dosing algorithms for dosage titration, but these algorithms have reported low success in achieving therapeutic targets. We explored using artificial intelligence to assist vancomycin dosage titration. Methods: We used a novel method to generate the label for each record and only included records with appropriate label data to generate a clean cohort with 2,282 patients and 7,912 injection records. Among them, 64% of patients were used to train two machine learning models, one for initial dose recommendation and another for subsequent dose recommendation. The model performance was evaluated using two metrics: PAR, a pharmacologically meaningful metric defined by us, and Mean Absolute Error (MAE), a commonly used regression metric. Results: In our 3-year data, only a small portion (34.1%) of current injection doses could reach the desired vancomycin trough level (14–20 mcg/ml). Both PAR and MAE of our machine learning models were better than those of the classical pharmacokinetic models. Our model also showed better performance than the other previously developed machine learning models in our test data. Conclusion: We developed machine learning models to recommend vancomycin dosage. Our results show that the new AI-assisted dosage titration approach has the potential to improve on the traditional approaches. This is especially useful to guide decision making for inexperienced doctors in making consistent and safe dosing recommendations for high-risk medications like vancomycin.
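Two of the quantities mentioned in this abstract are simple to compute: the Mean Absolute Error and the share of trough levels that fall inside a target window. The sketch below shows both under assumed names and made-up numbers; it does not implement PAR, whose definition is not given here.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between paired actual and predicted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def fraction_in_range(levels, low, high):
    """Share of measurements that fall within [low, high], inclusive."""
    if not levels:
        raise ValueError("levels must be non-empty")
    return sum(low <= x <= high for x in levels) / len(levels)


if __name__ == "__main__":
    # Made-up trough levels in mcg/ml, only to show the call pattern.
    troughs = [10.0, 15.0, 20.0, 25.0]
    print(fraction_in_range(troughs, 14.0, 20.0))
    print(mean_absolute_error([1.0, 2.0, 3.0], [1.0, 2.0, 5.0]))
```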
Table_1_Assessment of fractional flow reserve in intermediate coronary...
datasetcatalog.nlm.nih.gov
frontiersin.figshare.com
Updated Jan 25, 2023
Lee, Yong-Joon; Hong, Sung-Jin; Jang, Yangsoo; Hong, Myeong-Ki; Lee, Seung-Jun; Ko, Young-Guk; Kim, Jung-Sun; Tran, Cong; Nguyen, Ngoc-Luu; Ha, Jinyong; Lee, Seul-Gee; Shin, Won-Yong; Kim, Byeong-Keuk; Cha, Jung-Joon; Ahn, Chul-Min; Choi, Donghoon (2023). Table_1_Assessment of fractional flow reserve in intermediate coronary stenosis using optical coherence tomography-based machine learning.DOCX [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001004289
Explore at:
Dataset updated
Jan 25, 2023
Authors
Lee, Yong-Joon; Hong, Sung-Jin; Jang, Yangsoo; Hong, Myeong-Ki; Lee, Seung-Jun; Ko, Young-Guk; Kim, Jung-Sun; Tran, Cong; Nguyen, Ngoc-Luu; Ha, Jinyong; Lee, Seul-Gee; Shin, Won-Yong; Kim, Byeong-Keuk; Cha, Jung-Joon; Ahn, Chul-Min; Choi, Donghoon
Description
Objectives: This study aimed to evaluate and compare the diagnostic accuracy of machine learning (ML)-fractional flow reserve (FFR) based on optical coherence tomography (OCT) with wire-based FFR irrespective of the coronary territory. Background: ML techniques for assessing hemodynamic features including FFR in coronary artery disease have been developed based on various imaging modalities. However, there is no study using OCT-based ML models for all coronary artery territories. Methods: OCT and FFR data were obtained for 356 individual coronary lesions in 130 patients. The training and testing groups were divided in a ratio of 4:1. The ML-FFR was derived for the testing group and compared with the wire-based FFR in terms of the diagnosis of ischemia (FFR ≤ 0.80). Results: The mean age of the subjects was 62.6 years. The numbers of the left anterior descending, left circumflex, and right coronary arteries were 130 (36.5%), 110 (30.9%), and 116 (32.6%), respectively. Using seven major features, the ML-FFR showed strong correlation (r = 0.8782, P < 0.001) with the wire-based FFR. The ML-FFR predicted wire-based FFR ≤ 0.80 in the test set with sensitivity of 98.3%, specificity of 61.5%, and overall accuracy of 91.7% (area under the curve: 0.948). External validation showed good correlation (r = 0.7884, P < 0.001) and accuracy of 83.2% (area under the curve: 0.912). Conclusion: OCT-based ML-FFR showed good diagnostic performance in predicting FFR irrespective of the coronary territory. Because this was a small study, the performance should be confirmed in further large-scale research.
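The sensitivity, specificity, and accuracy figures quoted above follow from dichotomizing both the wire-based and the ML-derived FFR at the ischemia threshold (FFR ≤ 0.80). The following is a minimal sketch of that calculation, not the authors' code; the function name and the sample values are made up for illustration.

```python
def diagnostic_metrics(reference_ffr, predicted_ffr, threshold=0.80):
    """Sensitivity, specificity and accuracy for detecting FFR <= threshold.

    reference_ffr: wire-based FFR values (treated as ground truth).
    predicted_ffr: ML-derived FFR values for the same lesions.
    """
    tp = fp = tn = fn = 0
    for ref, pred in zip(reference_ffr, predicted_ffr):
        ref_positive = ref <= threshold    # ischemia according to the wire
        pred_positive = pred <= threshold  # ischemia according to the model
        if ref_positive and pred_positive:
            tp += 1
        elif ref_positive:
            fn += 1
        elif pred_positive:
            fp += 1
        else:
            tn += 1
    total = tp + fp + tn + fn
    return {
        "sensitivity": tp / (tp + fn) if tp + fn else float("nan"),
        "specificity": tn / (tn + fp) if tn + fp else float("nan"),
        "accuracy": (tp + tn) / total if total else float("nan"),
    }


# Hypothetical values, for illustration only (not data from the study).
wire = [0.72, 0.85, 0.78, 0.91, 0.80, 0.88]
ml = [0.75, 0.83, 0.82, 0.79, 0.79, 0.90]
metrics = diagnostic_metrics(wire, ml)
```

With a positive class that dominates the sample, high sensitivity and accuracy can coexist with modest specificity, which is the pattern reported above.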
Insect Detect - insect classification dataset v2
zenodo.org
data.niaid.nih.gov
zip
Updated Dec 10, 2023
Maximilian Sittinger; Johannes Uhler; Maximilian Pink (2023). Insect Detect - insect classification dataset v2 [Dataset]. http://doi.org/10.5281/zenodo.8325384
Explore at:
Available download formats: zip
Unique identifier
https://doi.org/10.5281/zenodo.8325384
Dataset updated
Dec 10, 2023
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Maximilian Sittinger; Johannes Uhler; Maximilian Pink
License
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Description
The Insect Detect - insect classification dataset v2 contains mainly images of various insects sitting on or flying above an artificial flower platform. All images were automatically recorded with the Insect Detect DIY camera trap, a hardware combination of the Luxonis OAK-1, Raspberry Pi Zero 2 W and PiJuice Zero pHAT for automated insect monitoring (bioRxiv preprint).
Most of the images were captured by camera traps deployed at different sites in 2023. For some classes (e.g. ant, bee_bombus, beetle_cocci, bug, bug_grapho, hfly_eristal, hfly_myathr, hfly_syrphus) additional images were captured with a lab setup of the camera trap. For some classes (e.g. bee_apis, fly, hfly_episyr, wasp) images from the first dataset version were transferred to this dataset.
This dataset is also available on Roboflow Universe. The images in the dataset from Roboflow are automatically compressed, which decreases model accuracy when used for training. Therefore it is recommended to use this uncompressed Zenodo version and split the dataset into train/val/test subsets in the provided training notebook.
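As a rough illustration of the train/val/test split mentioned above, the sketch below shuffles the image files of each class separately so that every class appears in all three subsets. It is not the procedure used in the provided training notebook; the function name, the 70/20/10 ratios, and the file names are assumptions.

```python
import random


def split_by_class(files_by_class, ratios=(0.7, 0.2, 0.1), seed=42):
    """Split each class's file list into train/val/test subsets.

    files_by_class: dict mapping a class name (e.g. "wasp") to a list of paths.
    ratios: assumed train/val fractions; the test subset takes the remainder.
    """
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    splits = {"train": {}, "val": {}, "test": {}}
    for label, files in files_by_class.items():
        shuffled = sorted(files)  # sort first so the result does not depend on input order
        rng.shuffle(shuffled)
        n_train = int(len(shuffled) * ratios[0])
        n_val = int(len(shuffled) * ratios[1])
        splits["train"][label] = shuffled[:n_train]
        splits["val"][label] = shuffled[n_train:n_train + n_val]
        splits["test"][label] = shuffled[n_train + n_val:]
    return splits


# Hypothetical file names, for illustration only.
example = {
    "wasp": [f"wasp_{i}.jpg" for i in range(10)],
    "lepi": [f"lepi_{i}.jpg" for i in range(10)],
}
splits = split_by_class(example)
```

Splitting per class rather than over the pooled file list keeps rare classes from ending up entirely in one subset.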

Classes
This dataset contains the following 27 classes:
ant (Formicidae)
bee (Anthophila excluding Apis mellifera and Bombus sp.)
bee_apis (Apis mellifera)
bee_bombus (Bombus sp.)
beetle (Coleoptera excluding Coccinellidae and some Oedemeridae)
beetle_cocci (Coccinellidae)
beetle_oedem (visually distinct Oedemeridae)
bug (Heteroptera excluding Graphosoma italicum)
bug_grapho (Graphosoma italicum)
fly (Brachycera excluding Empididae, Sarcophagidae, Syrphidae and small Brachycera)
fly_empi (Empididae)
fly_sarco (visually distinct Sarcophagidae)
fly_small (small Brachycera)
hfly_episyr (hoverfly Episyrphus balteatus)
hfly_eristal (hoverfly Eristalis sp., mainly Eristalis tenax)
hfly_eupeo (mainly hoverfly Eupeodes corollae and Scaeva pyrastri)
hfly_myathr (hoverfly Myathropa florea)
hfly_sphaero (hoverfly Sphaerophoria sp., mainly Sphaerophoria scripta)
hfly_syrphus (mainly hoverfly Syrphus sp.)
lepi (Lepidoptera)
none_bg (images with no insect - background (platform))
none_bird (images with no insect - bird sitting on platform)
none_dirt (images with no insect - leaves and other plant material, bird droppings)
none_shadow (images with no insect - shadows of insects or surrounding plants)
other (other Arthropods, including various Hymenoptera and Symphyta, Diptera, Orthoptera, Auchenorrhyncha, Neuroptera, Araneae)
scorpionfly (Panorpa sp.)
wasp (mainly Vespula sp. and Polistes dominula)
For the classes hfly_eupeo and hfly_syrphus a precise taxonomic distinction is not possible with images only, due to a potentially high variability in the appearance of the respective species. While most specimens will show the visual features that are important for a classification into one of these classes, some specimens of Syrphus sp. might look more like Eupeodes sp. and vice versa.
The images were sorted to the respective class by considering taxonomic and visual distinctions. However, this dataset is still rather small regarding the visually extremely diverse Insecta. Insects that are not included in this dataset can therefore be classified to the wrong class. All results should always be manually validated and false classifications can be used to extend this basic dataset and retrain your custom classification model.

Deployment
You can use this dataset as a starting point to train your own insect classification models with the provided Google Colab training notebook. Read the model training instructions for more information.
An insect classification model trained on this dataset is available in the insect-detect-ml GitHub repo. To deploy the model on your PC (ONNX format for fast CPU inference), follow the provided step-by-step instructions.

License
This dataset is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (CC BY-NC-SA 4.0).
Data from: Resolving Transition Metal Chemical Space: Feature Selection for...
acs.figshare.com
zip
Updated Jun 18, 2023
Jon Paul Janet; Heather J. Kulik (2023). Resolving Transition Metal Chemical Space: Feature Selection for Machine Learning and Structure–Property Relationships [Dataset]. http://doi.org/10.1021/acs.jpca.7b08750.s003
Explore at:
Available download formats: zip
Unique identifier
https://doi.org/10.1021/acs.jpca.7b08750.s003
Dataset updated
Jun 18, 2023
Dataset provided by
ACS Publications
Authors
Jon Paul Janet; Heather J. Kulik
License
Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Description
Machine learning (ML) of quantum mechanical properties shows promise for accelerating chemical discovery. For transition metal chemistry where accurate calculations are computationally costly and available training data sets are small, the molecular representation becomes a critical ingredient in ML model predictive accuracy. We introduce a series of revised autocorrelation functions (RACs) that encode relationships of the heuristic atomic properties (e.g., size, connectivity, and electronegativity) on a molecular graph. We alter the starting point, scope, and nature of the quantities evaluated in standard ACs to make these RACs amenable to inorganic chemistry. On an organic molecule set, we first demonstrate superior standard AC performance to other presently available topological descriptors for ML model training, with mean unsigned errors (MUEs) for atomization energies on set-aside test molecules as low as 6 kcal/mol. For inorganic chemistry, our RACs yield 1 kcal/mol ML MUEs on set-aside test molecules in spin-state splitting in comparison to 15–20× higher errors for feature sets that encode whole-molecule structural information. Systematic feature selection methods including univariate filtering, recursive feature elimination, and direct optimization (e.g., random forest and LASSO) are compared. Random-forest- or LASSO-selected subsets 4–5× smaller than the full RAC set produce sub- to 1 kcal/mol spin-splitting MUEs, with good transferability to metal–ligand bond length prediction (0.004–5 Å MUE) and redox potential on a smaller data set (0.2–0.3 eV MUE). Evaluation of feature selection results across property sets reveals the relative importance of local, electronic descriptors (e.g., electronegativity, atomic number) in spin-splitting and distal, steric effects in redox potential and bond lengths.
Model performance results based on random forest, gradient boosting,...
figshare.com
xls
Updated Mar 28, 2024
Junying Wang; David D. Wu; Christine DeLorenzo; Jie Yang (2024). Model performance results based on random forest, gradient boosting, penalized logistic regression, XGBoost, SVM, neural network, and stacking for EMBARC data as training set and APAT data as testing set after multiple imputation for 10 times. [Dataset]. http://doi.org/10.1371/journal.pone.0299625.t006
Explore at:
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pone.0299625.t006
Dataset updated
Mar 28, 2024
Dataset provided by
PLOS ONE
Authors
Junying Wang; David D. Wu; Christine DeLorenzo; Jie Yang
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Model performance results based on random forest, gradient boosting, penalized logistic regression, XGBoost, SVM, neural network, and stacking for EMBARC data as training set and APAT data as testing set after multiple imputation for 10 times.
Summary of the negative datasets generated by different methods.
plos.figshare.com
xls
Updated Sep 6, 2024
Efrat Cohen-Davidi; Isana Veksler-Lublinsky (2024). Summary of the negative datasets generated by different methods. [Dataset]. http://doi.org/10.1371/journal.pcbi.1012385.t003
Explore at:
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pcbi.1012385.t003
Dataset updated
Sep 6, 2024
Dataset provided by
PLOS Computational Biology
Authors
Efrat Cohen-Davidi; Isana Veksler-Lublinsky
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The method numbers correspond to method numbers in Fig 1. FPD denotes the full-positive-dataset.
SentiGAN configuration parameters.
plos.figshare.com
xls
Updated Sep 26, 2024
Rodrigo Gutiérrez Benítez; Alejandra Segura Navarrete; Christian Vidal-Castro; Claudia Martínez-Araneda (2024). SentiGAN configuration parameters. [Dataset]. http://doi.org/10.1371/journal.pone.0310707.t010
Explore at:
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pone.0310707.t010
Dataset updated
Sep 26, 2024
Dataset provided by
PLOS ONE
Authors
Rodrigo Gutiérrez Benítez; Alejandra Segura Navarrete; Christian Vidal-Castro; Claudia Martínez-Araneda
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Over the last ten years, social media has become a crucial data source for businesses and researchers, providing a space where people can express their opinions and emotions. To analyze this data and classify emotions and their polarity in texts, natural language processing (NLP) techniques such as emotion analysis (EA) and sentiment analysis (SA) are employed. However, the effectiveness of these tasks using machine learning (ML) and deep learning (DL) methods depends on large labeled datasets, which are scarce in languages like Spanish. To address this challenge, researchers use data augmentation (DA) techniques to artificially expand small datasets. This study aims to investigate whether DA techniques can improve classification results using ML and DL algorithms for sentiment and emotion analysis of Spanish texts. Various text manipulation techniques were applied, including transformations, paraphrasing (back-translation), and text generation using generative adversarial networks, to small datasets such as song lyrics, social media comments, headlines from national newspapers in Chile, and survey responses from higher education students. The findings show that the Convolutional Neural Network (CNN) classifier achieved the most significant improvement, with an 18% increase using the Generative Adversarial Networks for Sentiment Text (SentiGan) on the Aggressiveness (Seriousness) dataset. Additionally, the same classifier model showed an 11% improvement using the Easy Data Augmentation (EDA) on the Gender-Based Violence dataset. The performance of the Bidirectional Encoder Representations from Transformers (BETO) also improved by 10% on the back-translation augmented version of the October 18 dataset, and by 4% on the EDA augmented version of the Teaching survey dataset. These results suggest that data augmentation techniques enhance performance by transforming text and adapting it to the specific characteristics of the dataset. 
Through experimentation with various augmentation techniques, this research provides valuable insights into the analysis of subjectivity in Spanish texts and offers guidance for selecting algorithms and techniques based on dataset features.
Performance of ML models on test data.
plos.figshare.com
xls
Updated Oct 31, 2023
Mrinal Saha; Aparna Deb; Imtiaz Sultan; Sujat Paul; Jishan Ahmed; Goutam Saha (2023). Performance of ML models on test data. [Dataset]. http://doi.org/10.1371/journal.pgph.0002475.t005
Explore at:
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pgph.0002475.t005
Dataset updated
Oct 31, 2023
Dataset provided by
PLOS Global Public Health
Authors
Mrinal Saha; Aparna Deb; Imtiaz Sultan; Sujat Paul; Jishan Ahmed; Goutam Saha
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Vitamin D insufficiency appears to be prevalent in SLE patients. Multiple factors potentially contribute to lower vitamin D levels, including limited sun exposure, the use of sunscreen, darker skin complexion, aging, obesity, specific medical conditions, and certain medications. The study aims to assess the risk factors associated with low vitamin D levels in SLE patients in the southern part of Bangladesh, a region noted for a high prevalence of SLE. The research additionally investigates the possible correlation between vitamin D and the SLEDAI score, seeking to understand the potential benefits of vitamin D in enhancing disease outcomes for SLE patients. The study incorporates a dataset consisting of 50 patients from the southern part of Bangladesh and evaluates their clinical and demographic data. An initial exploratory data analysis is conducted to gain insights into the data, which includes calculating means and standard deviations, performing correlation analysis, and generating heat maps. Relevant inferential statistical tests, such as the Student’s t-test, are also employed. In the machine learning part of the analysis, this study utilizes supervised learning algorithms, specifically Linear Regression (LR) and Random Forest (RF). To optimize the hyperparameters of the RF model and mitigate the risk of overfitting given the small dataset, a 3-Fold cross-validation strategy is implemented. The study also calculates bootstrapped confidence intervals to provide robust uncertainty estimates and further validate the approach. A comprehensive feature importance analysis is carried out using RF feature importance, permutation-based feature importance, and SHAP values. The LR model yields an RMSE of 4.83 (CI: 2.70, 6.76) and MAE of 3.86 (CI: 2.06, 5.86), whereas the RF model achieves better results, with an RMSE of 2.98 (CI: 2.16, 3.76) and MAE of 2.68 (CI: 1.83,3.52). 
Both models identify Hb, CRP, ESR, and age as significant contributors to vitamin D level predictions. Despite the lack of a significant association between SLEDAI and vitamin D in the statistical analysis, the machine learning models suggest a potential nonlinear dependency of vitamin D on SLEDAI. These findings highlight the importance of these factors in managing vitamin D levels in SLE patients. The study concludes that there is a high prevalence of vitamin D insufficiency in SLE patients. Although a direct linear correlation between the SLEDAI score and vitamin D levels is not observed, machine learning models suggest the possibility of a nonlinear relationship. Furthermore, factors such as Hb, CRP, ESR, and age are identified as more significant in predicting vitamin D levels. Thus, the study suggests that monitoring these factors may be advantageous in managing vitamin D levels in SLE patients. Given the immunological nature of SLE, the potential role of vitamin D in SLE disease activity could be substantial. Therefore, it underscores the need for further large-scale studies to corroborate this hypothesis.
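The bootstrapped confidence intervals reported for RMSE and MAE can be illustrated with a percentile bootstrap over paired observations and predictions. The description does not say which bootstrap variant the authors used, so the percentile method, the function names, and the sample values below are assumptions.

```python
import math
import random


def rmse(y_true, y_pred):
    """Root mean squared error of paired observations and predictions."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def bootstrap_ci(y_true, y_pred, metric=rmse, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for a paired error metric.

    Resamples (observation, prediction) pairs with replacement and returns the
    alpha/2 and 1 - alpha/2 percentiles of the resampled metric values.
    """
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(metric([y_true[i] for i in idx], [y_pred[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_resamples)]
    upper = stats[min(n_resamples - 1, int((1 - alpha / 2) * n_resamples))]
    return lower, upper


# Hypothetical vitamin D levels and predictions, for illustration only.
observed = [18.0, 22.5, 15.2, 30.1, 25.4, 19.8, 27.3, 21.0, 16.7, 24.9]
predicted = [20.1, 21.0, 17.5, 27.8, 26.0, 22.3, 25.1, 19.4, 18.9, 23.2]
ci_low, ci_high = bootstrap_ci(observed, predicted)
```

With only a few dozen patients the interval is typically wide, which is consistent with the broad ranges reported above.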
Data from: Identification of High-Reliability Regions of Machine Learning...
acs.figshare.com
figshare.com
txt
Updated Nov 20, 2023
Evan M. Askenazi; Emanuel A. Lazar; Ilya Grinberg (2023). Identification of High-Reliability Regions of Machine Learning Predictions Based on Materials Chemistry [Dataset]. http://doi.org/10.1021/acs.jcim.3c01684.s006
Explore at:
Available download formats: txt
Unique identifier
https://doi.org/10.1021/acs.jcim.3c01684.s006
Dataset updated
Nov 20, 2023
Dataset provided by
ACS Publications
Authors
Evan M. Askenazi; Emanuel A. Lazar; Ilya Grinberg
License
Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Description
Progress in the application of machine learning (ML) methods to materials design is hindered by the lack of understanding of the reliability of ML predictions, in particular, for the application of ML to small data sets often found in materials science. Using ML prediction for transparent conductor oxide formation energy and band gap, dilute solute diffusion, and perovskite formation energy, band gap, and lattice parameter as examples, we demonstrate that (1) construction of a convex hull in feature space that encloses accurately predicted systems can be used to identify regions in feature space for which ML predictions are highly reliable; (2) analysis of the systems enclosed by the convex hull can be used to extract physical understanding; and (3) materials that satisfy all well-known chemical and physical principles that make a material physically reasonable are likely to be similar and show strong relationships between the properties of interest and the standard features used in ML. We also show that similar to the composition–structure–property relationships, inclusion in the ML training data set of materials from classes with different chemical properties will not be beneficial for the accuracy of ML prediction and that reliable results likely will be obtained by ML model for narrow classes of similar materials even in the case where the ML model will show large errors on the data set consisting of several classes of materials.
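The idea in point (1), enclosing accurately predicted systems in a convex hull and treating new points inside it as high-reliability, can be sketched in two feature dimensions. This is only an illustration under simplifying assumptions: the feature space in the study may have more dimensions, the hull construction used by the authors is not described here, and all names and values below are hypothetical.

```python
def cross(o, a, b):
    """Z-component of the cross product of vectors o->a and o->b."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def inside_hull(hull, q):
    """True if q lies inside or on the boundary of a counter-clockwise hull."""
    n = len(hull)
    if n < 3:
        return False  # a point or segment encloses no region
    return all(cross(hull[i], hull[(i + 1) % n], q) >= 0 for i in range(n))


# Hypothetical 2D feature vectors of systems that the model predicted accurately.
accurate = [(0, 0), (2, 0), (2, 2), (0, 2), (1, 1)]
hull = convex_hull(accurate)
reliable = inside_hull(hull, (1, 1))    # candidate inside the enclosed region
unreliable = inside_hull(hull, (3, 1))  # candidate outside the enclosed region
```

For higher-dimensional feature spaces, a library routine such as SciPy's `scipy.spatial.ConvexHull` would be the more practical choice.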


