18 datasets found

Packages Object Detection Dataset - augmented-v1
public.roboflow.com
zip
Updated Jan 14, 2021
Cite
Roboflow Community (2021). Packages Object Detection Dataset - augmented-v1 [Dataset]. https://public.roboflow.com/object-detection/packages-dataset/5
Available download formats: zip
Dataset updated
Jan 14, 2021
Dataset provided by
Roboflow, Inc.
Authors
Roboflow Community
License
CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
License information was derived automatically
Variables measured
Bounding Boxes of packages
Description
About This Dataset

The Roboflow Packages dataset is a collection of packages located at the doors of various apartments and homes. Packages are flat envelopes, small boxes, and large boxes. Some images contain multiple annotated packages.

Usage

This dataset may be used as a good starter dataset to track and identify when a package has been delivered to a home. Perhaps you want to know when a package arrives to claim it quickly or prevent package theft.

If you plan to use this dataset and adapt it to your own front door, it is recommended that you capture and add images from the context of your specific camera position. You can easily add images to this dataset via the web UI or via the Roboflow Upload API.

About Roboflow

Roboflow enables teams to build better computer vision models faster. We provide tools for image collection, organization, labeling, preprocessing, augmentation, training and deployment. Developers reduce boilerplate code when using Roboflow's workflow, save training time, and increase model reproducibility.
Supplementary file 1_Data augmented lung cancer prediction framework using...
frontiersin.figshare.com
docx
Updated Feb 25, 2025
Cite
Yifan Jiang; Venkata S. K. Manem (2025). Supplementary file 1_Data augmented lung cancer prediction framework using the nested case control NLST cohort.docx [Dataset]. http://doi.org/10.3389/fonc.2025.1492758.s001
Available download formats: docx
Unique identifier
https://doi.org/10.3389/fonc.2025.1492758.s001
Dataset updated
Feb 25, 2025
Dataset provided by
Frontiers
Authors
Yifan Jiang; Venkata S. K. Manem
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
Purpose: In the context of lung cancer screening, the scarcity of well-labeled medical images poses a significant challenge to implementing supervised deep learning methods. While data augmentation is an effective technique for countering the difficulties caused by insufficient data, it has not been fully explored in the context of lung cancer screening. In this research study, we analyzed the state-of-the-art (SOTA) data augmentation techniques for lung cancer binary prediction.

Methods: To comprehensively evaluate the efficiency of data augmentation approaches, we considered the nested case control National Lung Screening Trial (NLST) cohort comprising 253 individuals who had the commonly used CT scans without contrast. The CT scans were pre-processed into three-dimensional volumes based on the lung nodule annotations. Subsequently, we evaluated five basic (online) and two generative model-based offline data augmentation methods with ten state-of-the-art (SOTA) 3D deep learning-based lung cancer prediction models.

Results: Our results demonstrated that the performance improvement by data augmentation was highly dependent on the approach used. The Cutmix method resulted in the highest average performance improvement across all three metrics: 1.07%, 3.29%, 1.19% for accuracy, F1 score and AUC, respectively. MobileNetV2 with a simple data augmentation approach achieved the best AUC of 0.8719 among all lung cancer predictors, demonstrating a 7.62% improvement compared to baseline. Furthermore, the MED-DDPM data augmentation approach was able to improve prediction performance by rebalancing the training set and adding moderately synthetic data.

Conclusions: The effectiveness of online and offline data augmentation methods was highly sensitive to the prediction model, highlighting the importance of carefully selecting the optimal data augmentation method. Our findings suggest that certain traditional methods can provide more stable and higher performance compared to SOTA online data augmentation approaches. Overall, these results offer meaningful insights for the development and clinical integration of data augmented deep learning tools for lung cancer screening.
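For readers unfamiliar with the Cutmix method named in the results, the following is a minimal 2D sketch in Python with NumPy. It is not the study's implementation: the study works on 3D CT volumes, and the function name `cutmix`, the `alpha` default, and the one-hot label format are all assumptions made for illustration.

```python
import numpy as np

def cutmix(img_a, label_a, img_b, label_b, alpha=1.0, rng=None):
    """Paste a random box from img_b into img_a and mix the labels to match.

    Illustrative 2D version; labels are assumed to be one-hot vectors.
    """
    rng = np.random.default_rng() if rng is None else rng
    h, w = img_a.shape[:2]
    lam = rng.beta(alpha, alpha)                 # target share of img_a to keep
    cut_h = int(h * np.sqrt(1.0 - lam))          # box sides scale with sqrt(1 - lam)
    cut_w = int(w * np.sqrt(1.0 - lam))
    cy, cx = rng.integers(0, h), rng.integers(0, w)   # box centre
    y1, y2 = max(cy - cut_h // 2, 0), min(cy + cut_h // 2, h)
    x1, x2 = max(cx - cut_w // 2, 0), min(cx + cut_w // 2, w)
    mixed = img_a.copy()
    mixed[y1:y2, x1:x2] = img_b[y1:y2, x1:x2]
    # Recompute lam from the clipped box so the label matches the pixels.
    lam = 1.0 - ((y2 - y1) * (x2 - x1)) / (h * w)
    return mixed, lam * label_a + (1.0 - lam) * label_b
```

The mixed label weights each class by the fraction of pixels it contributes, which is why the box area is recomputed after clipping at the image border.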
Data from: Augmentation of telemedicine post-operative follow-up after...
datasetcatalog.nlm.nih.gov
Updated Aug 3, 2022
Cite
Vagefi, M. Reza; Grob, Seanna R.; Ahmad, Meleha; Winn, Bryan J.; Smith, Loreley D.; Ashraf, Davin C.; Kersten, Robert C.; Miller, Amanda (2022). Augmentation of telemedicine post-operative follow-up after oculofacial plastic surgery with a self-guided patient tool [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0000249833
Dataset updated
Aug 3, 2022
Authors
Vagefi, M. Reza; Grob, Seanna R.; Ahmad, Meleha; Winn, Bryan J.; Smith, Loreley D.; Ashraf, Davin C.; Kersten, Robert C.; Miller, Amanda
Description
This study evaluates a web-based tool designed to augment telemedicine post-operative visits after periocular surgery. Adult, English-speaking patients undergoing periocular surgery with telemedicine follow-up were studied prospectively in this interventional case series. Participants submitted visual acuity measurements and photographs via a web-based tool prior to routine telemedicine post-operative visits. An after-visit survey assessed patient perceptions. Surgeons rated photographs and live video for quality and blurriness; external raters also evaluated photographs. Images were analyzed for facial centration, resolution, and algorithmically detected blur. Complications were recorded and graded for severity and relation to telemedicine. Seventy-nine patients were recruited. Surgeons requested an in-person assessment for six patients (7.6%) due to inadequate evaluation by telemedicine. Surgeons rated patient-provided photographs to be of higher quality than live video at the time of the post-operative visit (p < 0.001). Image blur and resolution had moderate and weak correlation with photograph quality, respectively. A photograph blur detection algorithm demonstrated sensitivity of 85.5% and specificity of 75.1%. One patient experienced a wound dehiscence with a possible relationship to inadequate evaluation during telemedicine follow-up. Patients rated the telemedicine experience and their comfort with the structure of the visit highly. Augmented telemedicine follow-up after oculofacial plastic surgery is associated with high patient satisfaction, rare conversion to clinic evaluation, and few related post-operative complications. Automated detection of image resolution and blur may play a role in screening photographs for subsequent iterations of the web-based tool.
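The description does not say which blur detection algorithm the study used. As an illustration only, one common heuristic scores sharpness by the variance of a discrete Laplacian; the sketch below assumes a grayscale image as a 2D NumPy array, and the threshold value is a hypothetical placeholder that would need tuning on real photographs.

```python
import numpy as np

def laplacian_variance(gray):
    """Variance of a 4-neighbour Laplacian; low values suggest a blurry image."""
    g = np.asarray(gray, dtype=np.float64)
    lap = (g[:-2, 1:-1] + g[2:, 1:-1] + g[1:-1, :-2] + g[1:-1, 2:]
           - 4.0 * g[1:-1, 1:-1])
    return float(lap.var())

def is_blurry(gray, threshold=100.0):
    """Flag an image as blurry; the threshold is an assumed, untuned value."""
    return laplacian_variance(gray) < threshold
```

Sweeping the threshold against surgeon ratings would trade sensitivity against specificity, which is the kind of operating point the reported 85.5% and 75.1% figures describe.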
Literature collection of Text Data Augmentation
scidb.cn
Updated Aug 22, 2024
Cite
Feng Ran (2024). Literature collection of Text Data Augmentation [Dataset]. http://doi.org/10.57760/sciencedb.j00133.00356
Available download formats: Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Unique identifier
https://doi.org/10.57760/sciencedb.j00133.00356
Dataset updated
Aug 22, 2024
Dataset provided by
Science Data Bank
Authors
Feng Ran
License
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) (https://creativecommons.org/licenses/by-nc-sa/4.0/)
License information was derived automatically
Description
A list of references obtained by searching and screening two English databases, Web of Science (WOS) and Google Scholar, as well as two Chinese databases, CNKI and Wanfang Data, using "text enhancement" as the keyword. The time range is from 2015 to 2024, and the list includes descriptions of titles, enhancement methods, categories, datasets, and tools.
Data from: Combining Group Contribution Method and Semisupervised Learning...
acs.figshare.com
datasetcatalog.nlm.nih.gov
xlsx
Updated Dec 26, 2024
Cite
Zhao Liu; Lanyu Shang; Kuan Huang; Zhenrui Yue; Alan Y. Han; Dong Wang; Huichun Zhang (2024). Combining Group Contribution Method and Semisupervised Learning to Build Machine Learning Models for Predicting Hydroxyl Radical Rate Constants of Water Contaminants [Dataset]. http://doi.org/10.1021/acs.est.4c11950.s002
Available download formats: xlsx
Unique identifier
https://doi.org/10.1021/acs.est.4c11950.s002
Dataset updated
Dec 26, 2024
Dataset provided by
ACS Publications
Authors
Zhao Liu; Lanyu Shang; Kuan Huang; Zhenrui Yue; Alan Y. Han; Dong Wang; Huichun Zhang
License
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) (https://creativecommons.org/licenses/by-nc/4.0/)
License information was derived automatically
Description
Machine learning is an effective tool for predicting reaction rate constants for many organic compounds with the hydroxyl radical (HO•). Previously reported models have achieved relatively good performance, but due to scarce data (
Precision, recall, and F1-score for each class.
plos.figshare.com
xls
Updated May 9, 2025
Cite
Yuki Wong; Eileen Lee Ming Su; Che Fai Yeong; William Holderbaum; Chenguang Yang (2025). Precision, recall, and F1-score for each class. [Dataset]. http://doi.org/10.1371/journal.pone.0322624.t003
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pone.0322624.t003
Dataset updated
May 9, 2025
Dataset provided by
PLOS ONE
Authors
Yuki Wong; Eileen Lee Ming Su; Che Fai Yeong; William Holderbaum; Chenguang Yang
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
Brain tumors pose a significant medical challenge, necessitating early detection and precise classification for effective treatment. This study aims to address this challenge by introducing an automated brain tumor classification system that utilizes deep learning (DL) and Magnetic Resonance Imaging (MRI) images. The main purpose of this research is to develop a model that can accurately detect and classify different types of brain tumors, including glioma, meningioma, pituitary tumors, and normal brain scans. A convolutional neural network (CNN) architecture with pretrained VGG16 as the base model is employed, and diverse public datasets are utilized to ensure comprehensive representation. Data augmentation techniques are employed to enhance the training dataset, resulting in a total of 17,136 brain MRI images across the four classes. The accuracy of this model was 99.24%, a higher accuracy than other similar works, demonstrating its potential clinical utility. This higher accuracy was achieved mainly due to the utilization of a large and diverse dataset, the improvement of network configuration, the application of a fine-tuning strategy to adjust pretrained weights, and the implementation of data augmentation techniques in enhancing classification performance for brain tumor detection. In addition, a web application was developed by leveraging HTML and Dash components to enhance usability, allowing for easy image upload and tumor prediction. By harnessing artificial intelligence (AI), the developed system addresses the need to reduce human error and enhance diagnostic accuracy. The proposed approach provides an efficient and reliable solution for brain tumor classification, facilitating early diagnosis and enabling timely medical interventions. This work signifies a potential advancement in brain tumor classification, promising improved patient care and outcomes.
The model alterations description per epoch during training.
plos.figshare.com
xls
Updated Jun 21, 2023
Cite
Amin Tajerian; Mohsen Kazemian; Mohammad Tajerian; Ava Akhavan Malayeri (2023). The model alterations description per epoch during training. [Dataset]. http://doi.org/10.1371/journal.pone.0284437.t002
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pone.0284437.t002
Dataset updated
Jun 21, 2023
Dataset provided by
PLOS ONE
Authors
Amin Tajerian; Mohsen Kazemian; Mohammad Tajerian; Ava Akhavan Malayeri
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
The model alterations description per epoch during training.
Federated Learning Market Analysis, Size, and Forecast 2025-2029: North...
technavio.com
Updated Jul 2, 2025
Cite
Technavio (2025). Federated Learning Market Analysis, Size, and Forecast 2025-2029: North America (US, Canada, and Mexico), Europe (France, Germany, and UK), APAC (China, India, and Japan), South America (Brazil), and Rest of World (ROW) [Dataset]. https://www.technavio.com/report/federated-learning-market-industry-analysis
Dataset updated
Jul 2, 2025
Dataset provided by
TechNavio
Authors
Technavio
Time period covered
2021 - 2025
Area covered
Mexico, Brazil, Canada, Germany, United States, Global
Description
Federated Learning Market Size 2025-2029

The federated learning market size is forecast to increase by USD 301.1 million at a CAGR of 15.9% between 2024 and 2029.
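The relationship between the quoted increment and the CAGR can be worked through with a short calculation. This is a sketch under stated assumptions: it treats 2024 to 2029 as five compounding periods and assumes the USD 301.1 million increment and the 15.9% rate refer to the same base year. The report excerpt does not state the base-year market size, so the value this returns is only an implied estimate, and the function name `implied_base_size` is hypothetical.

```python
def implied_base_size(increase, cagr, years):
    """Base value B such that B * (1 + cagr) ** years - B equals `increase`."""
    growth = (1.0 + cagr) ** years
    return increase / (growth - 1.0)

# Figures quoted above; five compounding years is an assumption.
base_2024 = implied_base_size(increase=301.1, cagr=0.159, years=5)
size_2029 = base_2024 * (1.0 + 0.159) ** 5
```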

The market is experiencing significant growth, driven by increasing data privacy regulations and growing privacy concerns. As organizations seek to protect sensitive data while still leveraging machine learning capabilities, federated learning's decentralized approach offers an attractive solution. However, this market is not without challenges. The rise of vertical-specific federated learning platforms necessitates a focused approach for companies looking to capitalize on this technology. Machine learning algorithms, including random forests, naive Bayes, decision trees, clustering algorithms, and k-nearest neighbors, are essential tools for risk management and compliance monitoring. Companies must navigate these challenges to effectively implement federated learning and reap the rewards of improved data privacy and enhanced machine learning capabilities. By addressing these obstacles, organizations can successfully harness the power of federated learning to gain a competitive edge in their industries. Additionally, data security and privacy concerns are becoming increasingly important, requiring strong security measures to protect sensitive industrial data. Furthermore, the significant technical complexity and statistical heterogeneity associated with federated learning require specialized expertise and resources.

What will be the Size of the Federated Learning Market during the forecast period?

Explore in-depth regional segment analysis with market size data - historical 2019-2023 and forecasts 2025-2029 - in the full report.
In the dynamic world of machine learning, federated learning is gaining traction as a promising approach for personalized learning and recommendation systems. Reinforcement learning techniques are being employed to optimize models in a decentralized manner, enhancing model robustness and adaptability. Online learning environments leverage federated analytics to deliver customized educational content, while collaboration between learners is facilitated through collaborative filtering. Identity management and privacy impact assessments are crucial components of federated learning, ensuring data lineage and provenance are maintained. Security measures, such as model robustness assessments, penetration testing, threat modeling, and encryption keys, safeguard against adversarial attacks and data breaches. Big data and AI are at the heart of this transformation, with machine learning algorithms enabling real-time monitoring and predictive maintenance of industrial machines.

Ensemble methods and key management systems contribute to improved model accuracy and access control, respectively. Decentralized identity solutions, like smart contracts and blockchain technology, provide secure and private data sharing mechanisms. Data augmentation and transfer learning techniques enable model training on distributed data, further enhancing the effectiveness of federated learning. Zero-knowledge proofs and adaptive learning algorithms enable secure and efficient knowledge sharing, fostering a collaborative learning environment. Overall, federated learning continues to evolve, offering significant potential for online learning and recommendation systems while addressing data security and privacy concerns.

How is this Federated Learning Industry segmented?

The federated learning industry research report provides comprehensive data (region-wise segment analysis), with forecasts and estimates in 'USD million' for the period 2025-2029, as well as historical data from 2019-2023 for the following segments.

Deployment: Cloud, On-premises
Type: Horizontal federated learning, Vertical federated learning, Federated transfer learning
End-user: Healthcare, BFSI, Manufacturing, Automotive, IT and telecom
Technology: Federated averaging, Differential privacy, Homomorphic encryption
Geography: North America (US, Canada, Mexico), Europe (France, Germany, UK), APAC (China, India, Japan), South America (Brazil), Rest of World (ROW)

By Deployment Insights

The Cloud segment is estimated to witness significant growth during the forecast period. The market is witnessing significant growth, driven by the integration of various advanced technologies. Artificial intelligence, neural networks, and deep learning are at the core of this evolution, enabling model training on decentralized data without compromising data privacy and security. Edge computing plays a crucial role in this paradigm, allowing for local processing and model compression, reducing power consumption and data transmission. Data silos and privacy concerns are addressed through techniques su
Confusion matrix (DPC data).
datasetcatalog.nlm.nih.gov
plos.figshare.com
Updated Jun 18, 2025
Cite
Khan, Saddam Hussain; Shahzad, Abdul Raheem; Bangyal, Waqas Haider; Amin, Muhammad Awais; Ahmad, Waqar; Alahmadi, Tahani Jaser (2025). Confusion matrix (DPC data). [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0002060685
Dataset updated
Jun 18, 2025
Authors
Khan, Saddam Hussain; Shahzad, Abdul Raheem; Bangyal, Waqas Haider; Amin, Muhammad Awais; Ahmad, Waqar; Alahmadi, Tahani Jaser
Description
The prevalence of Leukaemia, a malignant blood cancer that originates from hematopoietic progenitor cells, is increasing in Southeast Asia, with a worrisome fatality rate of 54%. Predicting outcomes in the early stages is vital for improving the chances of patient recovery. The aim of this research is to enhance early-stage prediction systems in a substantial manner. Using Machine Learning and Data Science, we exploit protein sequential data from commonly altered genes including BCL2, HSP90, PARP, and RB to make predictions for Chronic Myeloid Leukaemia (CML). The methodology we implement is based on the utilisation of reliable methods for extracting features, namely Di-peptide Composition (DPC), Amino Acid Composition (AAC), and Pseudo amino acid composition (Pse-AAC). We also take into consideration the identification and handling of outliers, as well as the validation of feature selection using the Pearson Correlation Coefficient (PCA). Data augmentation guarantees a comprehensive dataset for analysis. By utilising several Machine Learning models such as Support Vector Machine (SVM), XGBoost, Random Forest (RF), K Nearest Neighbour (KNN), Decision Tree (DT), and Logistic Regression (LR), we have achieved accuracy rates ranging from 66% to 94%. These classifiers are thoroughly evaluated utilising performance criteria such as accuracy, sensitivity, specificity, F1-score, and the confusion matrix. The solution we suggest is a user-friendly online application dashboard that can be used for early detection of CML. This tool has significant implications for practitioners and may be used in healthcare institutions and hospitals.
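Two of the feature extractors named above have simple standard definitions: Amino Acid Composition is the relative frequency of each of the 20 standard residues, and Di-peptide Composition is the relative frequency of each of the 400 ordered residue pairs. The sketch below illustrates those definitions in plain Python; it is not the authors' code, the function names are hypothetical, and residues outside the 20-letter alphabet are simply ignored here as a simplifying assumption.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues

def amino_acid_composition(seq):
    """Return a 20-entry dict of residue frequencies (AAC)."""
    seq = seq.upper()
    n = len(seq)
    return {a: (seq.count(a) / n if n else 0.0) for a in AMINO_ACIDS}

def dipeptide_composition(seq):
    """Return a 400-entry dict of adjacent-pair frequencies (DPC)."""
    seq = seq.upper()
    counts = {a + b: 0 for a, b in product(AMINO_ACIDS, repeat=2)}
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    for pair in pairs:
        if pair in counts:          # skip pairs with non-standard residues
            counts[pair] += 1
    total = len(pairs)
    return {k: (v / total if total else 0.0) for k, v in counts.items()}
```

Each sequence becomes a fixed-length vector (20 or 400 values) regardless of its length, which is what lets classifiers such as SVM or Random Forest consume protein sequences directly.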
Bone Fracture X-ray Dataset: Simple vs. Comminuted Fractures
data.mendeley.com
Updated Dec 3, 2024
Cite
Fahim Faisal Talha Talha (2024). Bone Fracture X-ray Dataset: Simple vs. Comminuted Fractures [Dataset]. http://doi.org/10.17632/vg95gvhj3y.3
Unique identifier
https://doi.org/10.17632/vg95gvhj3y.3
Dataset updated
Dec 3, 2024
Authors
Fahim Faisal Talha Talha
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
Dataset Overview: This dataset has been curated to support research in bone fracture classification, focusing on simple and comminuted fractures. It includes high-quality X-ray images and a diverse set of augmented images to facilitate the development and evaluation of machine learning models in medical imaging. The dataset is ideal for image classification, segmentation, and fracture type recognition tasks.

Fracture Categories:

1. Simple Fracture
Source: Exclusively sourced from hospital records.
Original Images: 1,211 images.
Augmented Images: 6,311 images.
Total Images: 7,522 images.

2. Comminuted Fracture
Source: A mix of hospital-sourced images and web-sourced images (approximately one-tenth from web pages).
Original Images: 1,173 images.
Augmented Images: 7,366 images.
Total Images: 8,539 images.

Key Features:
Number of Original Images: 2,384.
Number of Augmented Images: 13,677.
Total Dataset Size: 16,061 images (Original + Augmented).
File Formats: JPG.

Augmentation Techniques:
Zoom: Randomized scaling.
Rotation: ±30° rotation.
Brightness and Contrast Adjustments: 80%–120% range.
Flips: Vertical and horizontal flips.
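Two of the listed augmentation techniques, flips and brightness/contrast scaling in the 80%–120% range, can be sketched with NumPy alone. This is an illustration rather than the dataset author's pipeline: the flip probability of 0.5 and the choice to scale contrast about the image mean are assumptions, and zoom and rotation are left out because they need an interpolation routine from an imaging library.

```python
import numpy as np

def augment(img, rng):
    """Randomly flip an 8-bit image and jitter its brightness and contrast."""
    out = img.astype(np.float32)
    if rng.random() < 0.5:
        out = out[:, ::-1]                  # horizontal flip
    if rng.random() < 0.5:
        out = out[::-1, :]                  # vertical flip
    contrast = rng.uniform(0.8, 1.2)        # 80%-120% range from the listing
    brightness = rng.uniform(0.8, 1.2)
    mean = out.mean()
    out = (out - mean) * contrast + mean    # stretch or squeeze around the mean
    out = out * brightness                  # then scale overall intensity
    return np.clip(out, 0, 255).astype(np.uint8)
```

Calling `augment(image, np.random.default_rng(seed))` several times per original image with different seeds yields distinct variants.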

Update Details (Version 3): Original Submission Date: 15 Nov 2024. Update Date: 02 Dec 2024.

Changes in This Version: Expanded Dataset Size: From: 6,798 total images (previous version). To: 16,061 total images.

Augmented Images: Simple Fracture: Increased from 3,280 to 6,311 images. Comminuted Fracture: Increased from 2,900 to 7,366 images. Augmentation Enhancements: Additional transformations were applied to ensure varied and realistic images for training.

Applications:

Medical Imaging: Train and evaluate models for fracture classification and identification. Healthcare Technology: Support the development of diagnostic tools and mobile applications for real-time fracture detection. Medical Research: Aid in understanding fracture patterns and their visual indicators.

Dataset Collection: Simple Fracture Images: Exclusively sourced from hospitals to ensure clinical accuracy and relevance. Comminuted Fracture Images: Sourced from a mix of hospital records and web pages, representing diverse imaging conditions and perspectives.

Realistic Scenarios: The dataset simulates real-world clinical settings with varying lighting conditions, orientations, and imaging environments to provide robust training data. This dataset is a valuable resource for advancing medical imaging research and developing machine learning models tailored for fracture detection and classification tasks.
Comparing the mean age between skin cancer lesions.
plos.figshare.com
xls
Updated Jun 21, 2023
Cite
Amin Tajerian; Mohsen Kazemian; Mohammad Tajerian; Ava Akhavan Malayeri (2023). Comparing the mean age between skin cancer lesions. [Dataset]. http://doi.org/10.1371/journal.pone.0284437.t001
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pone.0284437.t001
Dataset updated
Jun 21, 2023
Dataset provided by
PLOS (http://plos.org/)
Authors
Amin Tajerian; Mohsen Kazemian; Mohammad Tajerian; Ava Akhavan Malayeri
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
Comparing the mean age between skin cancer lesions.
Mpox Skin Lesion Dataset Version 2.0 (MSLD v2.0)
kaggle.com
Updated Jul 4, 2023
Cite
Joydip Paul (2023). Mpox Skin Lesion Dataset Version 2.0 (MSLD v2.0) [Dataset]. https://www.kaggle.com/datasets/joydippaul/mpox-skin-lesion-dataset-version-20-msld-v20/code
Available download formats: Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Dataset updated
Jul 4, 2023
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Joydip Paul
License
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) (https://creativecommons.org/licenses/by-nc/4.0/)
License information was derived automatically
Description
Context

During the initial peak outbreak phase of Mpox, a significant challenge emerged due to the absence of a publicly available reliable dataset for the detection of Mpox. The rapid escalation of Mpox cases, with its potential spread reaching Europe and America as highlighted by the World Health Organization, along with emerging possibilities of Mpox cases in Asian countries, underscored the urgency of implementing computer-assisted detection as a critical tool. In this context, the immediate diagnosis of Mpox became an increasingly challenging endeavor. As the possibility of a Mpox outbreak loomed over densely populated countries like Bangladesh, the limitations of our available resources rendered rapid diagnosis unattainable. Hence, the dire need for computer-assisted detection methods became apparent.

To address this pressing need, the development of computer-assisted methods demanded an ample amount of diverse data, including skin lesion images of Mpox from individuals of different sexes, ethnicities, and skin tones. However, the scarcity of available data posed a considerable obstacle in this endeavor. In response to this critical situation, our research group took the initiative to develop one of the earliest datasets (MSLD) specifically tailored for Mpox, encompassing various classes including non-Mpox samples.

From June 2022 to May 2023, the Mpox Skin Lesion Dataset (MSLD) has undergone two iterations, resulting in the current version, MSLD v2.0. The previous version included two classes: "Mpox" and "Others" (non-Mpox), with the "Others" class comprising skin lesion images of chickenpox and measles, chosen for their similarity to Mpox. Building upon the limitations identified in the initial release, we have developed an enhanced and more comprehensive version, MSLD v2.0. This updated dataset encompasses a wider range of classes and provides a more diverse set of images suitable for multi-class classification.

MSLD v2.0 comprises images from six distinct classes, namely Mpox (284 images), Chickenpox (75 images), Measles (55 images), Cowpox (66 images), Hand-foot-mouth disease or HFMD (161 images), and Healthy (114 images). The dataset includes 755 original skin lesion images sourced from 541 distinct patients, ensuring a representative sample. Importantly, the latest version has received endorsement from professional dermatologists and obtained approval from appropriate regulatory authorities.

Content

The dataset is organized into two folders:

Original Images: This folder includes a subfolder named "FOLDS" containing five folds (fold1-fold5) for 5-fold cross-validation with the original images. Each fold has separate folders for the test, train, and validation sets.

Augmented Images: To enhance the classification task, various data augmentation techniques, such as rotation, translation, reflection, shear, hue, saturation, contrast, brightness jitter, noise, and scaling, were applied using MATLAB R2020a. To ensure result reproducibility, the augmented images are provided in this folder. It contains a subfolder called "FOLDS_AUG" with augmented images of the train sets from each fold in the "FOLDS" subfolder of the "Original Images". The augmentation process resulted in an approximate 14-fold increase in the number of images.

Naming Convention of the Images

Each image is assigned a name following the format DiseaseCode_PatientNumber_ImageNumber. The disease codes assigned to the six classes are: Mpox -> MKP, Chickenpox -> CHP, Cowpox -> CWP, Measles -> MSL, Hand, foot and mouth disease -> HFMD, Healthy -> HEALTHY. The assignment of the keywords is illustrated in the provided image "Keywords.jpg". For instance, an image named "MKP_17_01" belongs to the Mpox class and is the first image captured from the patient with ID 17.
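The naming convention can be decoded mechanically. The sketch below is a minimal parser assuming names look exactly like the documented example, optionally followed by a file extension; the function name `parse_image_name` and the returned dictionary keys are hypothetical.

```python
# Disease codes as listed in the dataset description.
CODE_TO_CLASS = {
    "MKP": "Mpox",
    "CHP": "Chickenpox",
    "CWP": "Cowpox",
    "MSL": "Measles",
    "HFMD": "HFMD",
    "HEALTHY": "Healthy",
}

def parse_image_name(name):
    """Split DiseaseCode_PatientNumber_ImageNumber into its three parts."""
    stem = name.rsplit(".", 1)[0] if "." in name else name  # drop any extension
    code, patient, image = stem.split("_")
    return {
        "label": CODE_TO_CLASS[code],
        "patient": int(patient),
        "image": int(image),
    }
```

Grouping images by the parsed patient number is one way to check that no patient appears in both the train and test sets of a fold.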

Data organization

The dataset includes an Excel file named "datalog.xlsx" consisting of 5 sheets (Sheet1-5), with each sheet corresponding to a specific fold (fold1-5). Each sheet contains three columns: train, validation, and test. These columns contain the names of the images belonging to the respective train, validation, and test sets for a particular fold.

Web Application

Since we intend to build an end-to-end solution, starting with dataset creation and ending with a live web app, a prototype of the web app has already been developed using the open-source Python Streamlit framework with a Flask core and is hosted on the Streamlit-provided server for a better user experience. In the app, Skin Lesion Detector, users get not only a suggestion but also the accuracy of that suggestion.

The codes required to build and train the model, all ...
Cv Cbi Mining Safety Dataset
universe.roboflow.com
zip
Updated May 29, 2024
Cite
MININGSAFETYCBI (2024). Cv Cbi Mining Safety Dataset [Dataset]. https://universe.roboflow.com/miningsafetycbi/cv-cbi-mining-safety/dataset/3
Available download formats: zip
Dataset updated
May 29, 2024
Dataset authored and provided by
MININGSAFETYCBI
License
MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
Variables measured
Boots Helms Vests Boots Bounding Boxes
Description
Project Description for Roboflow: Mining Safety - PPE Detection

Project Name: Mining Safety - PPE Detection

Overview:

The Mining Safety - PPE Detection project aims to enhance safety protocols in mining environments by leveraging computer vision technology to detect Personal Protective Equipment (PPE). This project focuses on the detection of various PPE items and the absence of mandatory safety gear to ensure that workers adhere to safety regulations, thereby minimizing the risk of accidents and injuries.

Objective:

To develop a robust object detection model capable of accurately identifying 13 different classes of PPE in real-time using a dataset sourced from Roboflow Universe. The ultimate goal is to integrate this model into a monitoring system that can alert supervisors about non-compliance with PPE requirements in mining sites.

PPE Classes (Labels):

Goggles

Helmet

Mask

No-Boots

No-Gloves

No-Helmet

No-Mask

No-Vest

Undefined

Vest

Boots

Ear-Protection

Gloves

Dataset:

Total Images: 7444

Source: Roboflow Universe

Annotations: Each image is annotated with bounding boxes corresponding to one or more of the 13 PPE classes.

Image Variety: The images come from various mining sites with different lighting conditions, camera angles, and worker positions to ensure diversity and robustness of the model.

Project Steps:

Data Collection and Annotation:

Import and utilize the dataset from Roboflow Universe, ensuring it covers diverse conditions and scenarios.

Verify and, if necessary, re-annotate images to match the 13 PPE classes accurately using the Roboflow platform.

Data Preprocessing:

Perform data augmentation techniques such as rotation, scaling, and cropping to increase the variability and size of the dataset.

Split the dataset into training, validation, and test sets (e.g., 80% training, 10% validation, 10% test).
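The 80/10/10 split mentioned above can be sketched as follows. This is a minimal illustration rather than the project's actual pipeline: the function name `split_dataset`, the fixed seed, and the placeholder file names are all assumptions made for the example.

```python
import random

def split_dataset(items, train_frac=0.8, val_frac=0.1, seed=0):
    """Shuffle items reproducibly and cut them into train/val/test lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    n_train = int(len(shuffled) * train_frac)
    n_val = int(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # whatever remains goes to the test set
    return train, val, test

# Hypothetical file names standing in for the 7444 images.
images = [f"img_{i:04d}.jpg" for i in range(7444)]
train, val, test = split_dataset(images)
print(len(train), len(val), len(test))
```

With 7444 items this yields 5955 training, 744 validation, and 745 test images; the test set absorbs the rounding remainder.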

Model Selection and Training:

Use a pre-trained YOLO (You Only Look Once) model due to its efficiency and accuracy in real-time object detection tasks.

Fine-tune the model on the annotated dataset using transfer learning to adapt it specifically to the mining safety PPE detection task.

Model Evaluation:

Evaluate the model's performance using metrics such as precision, recall, F1-score, and mean Average Precision (mAP).

Conduct error analysis to identify common misclassifications and refine the model accordingly.
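As a rough illustration of the per-class metrics named above, precision, recall, and F1-score can be computed from counts of true positives, false positives, and false negatives. The function name `detection_metrics` and the counts are made up for this sketch; mAP is not covered here because it additionally requires IoU matching and confidence-ranked detections.

```python
def detection_metrics(tp, fp, fn):
    """Return (precision, recall, f1) from TP, FP and FN counts for one class."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1

# Made-up counts for a single class, for illustration only.
precision, recall, f1 = detection_metrics(tp=80, fp=20, fn=10)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```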

Deployment:

Integrate the trained model into a real-time monitoring system.

Develop a user interface that displays video feeds and highlights detected PPE and any non-compliance issues.

Implement alert mechanisms to notify supervisors of any detected safety violations.
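One possible shape for the alert check is sketched below, assuming the detector returns a class label and a confidence score per detection. The `Detection` class, the `find_violations` helper, and the 0.5 threshold are hypothetical; only the "No-" class names come from the label list above.

```python
from dataclasses import dataclass

@dataclass
class Detection:
    label: str         # one of the 13 class names listed above
    confidence: float  # detector score in [0, 1]

def find_violations(detections, min_confidence=0.5):
    """Return the sorted, de-duplicated 'No-*' labels at or above min_confidence."""
    return sorted({d.label for d in detections
                   if d.label.startswith("No-") and d.confidence >= min_confidence})

# Made-up detections for a single video frame.
frame = [Detection("Helmet", 0.91), Detection("No-Vest", 0.78), Detection("No-Gloves", 0.32)]
violations = find_violations(frame)
if violations:
    print("ALERT:", ", ".join(violations))
```

A real system would pass the resulting list to whichever notification service is chosen instead of printing it.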

Continuous Improvement:

Collect feedback from the deployment to continuously improve the model.

Regularly update the dataset with new images and retrain the model to maintain high accuracy.

Expected Outcomes:

A high-accuracy object detection model capable of identifying and differentiating between 13 classes of PPE.

Enhanced safety monitoring system for mining sites, reducing the likelihood of accidents due to non-compliance with PPE regulations.

A scalable solution that can be adapted to other industrial environments requiring PPE detection.

Tools and Technologies:

Annotation Tool: Roboflow

Object Detection Model: YOLO (preferably YOLOv8 or YOLOv9 for efficiency)

Programming Language: Python

Frameworks: PyTorch or TensorFlow for model training and inference

Deployment Platform: Docker for containerization and deployment on edge devices or cloud platforms

Monitoring and Alert System: Custom-built using Flask/Django (for web interface) and integrated with real-time notification services (e.g., Slack, email, SMS)

This project will significantly contribute to improving the safety standards in mining operations by ensuring that all workers are consistently wearing the required protective gear.
TinyML Market Analysis, Size, and Forecast 2025-2029: North America (US and...
technavio.com
pdf
Updated Jul 9, 2025
Cite
Technavio (2025). TinyML Market Analysis, Size, and Forecast 2025-2029: North America (US and Canada), Europe (France, Germany, and UK), APAC (China, India, Japan, and South Korea), South America (Brazil), and Rest of World (ROW) [Dataset]. https://www.technavio.com/report/tinyml-market-industry-analysis
Explore at:
Available download formats: pdf
Dataset updated
Jul 9, 2025
Dataset provided by
TechNavio
Authors
Technavio
Time period covered
2025 - 2029
Area covered
Canada, Germany, United Kingdom, United States
Description
TinyML Market Size 2025-2029

The TinyML market size is forecast to increase by USD 5.67 billion at a CAGR of 34% between 2024 and 2029.
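To show how the two figures in this sentence relate, the sketch below applies the standard compound-growth formula. It assumes annual compounding over five periods with 2024 as the base year; the listing does not state the base-year market size, so the values printed are implied by those assumptions and are not figures from the report. The function names are illustrative.

```python
def size_after(base, cagr, years):
    """Market size after compounding `base` at `cagr` for `years` periods."""
    return base * (1 + cagr) ** years

def implied_base(increase, cagr, years):
    """Base size for which growth over `years` at `cagr` equals `increase`."""
    return increase / ((1 + cagr) ** years - 1)

# Assumption: USD 5.67 billion is the total increase from 2024 to 2029.
base = implied_base(increase=5.67, cagr=0.34, years=5)
final = size_after(base, cagr=0.34, years=5)
print(f"implied 2024 size: USD {base:.2f} billion")
print(f"implied 2029 size: USD {final:.2f} billion")
```

Under these assumptions the implied 2024 base is roughly USD 1.7 billion.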

The market is experiencing significant growth, driven by the increasing proliferation of Internet of Things (IoT) devices and the imperative for edge intelligence. The need for real-time processing and decision-making at the edge is becoming increasingly crucial for IoT applications, leading to a rise in demand for lightweight machine learning models like TinyML. These applications require complex computations to be performed at the edge, making TinyML an attractive solution due to its small size and low power consumption. However, the market faces challenges in overcoming technical complexity and the scarcity of specialized talent. Moreover, the emergence of vision and advanced sensor fusion as key applications for TinyML is further fueling market growth. Developing and deploying TinyML models requires a deep understanding of machine learning algorithms and hardware constraints, making it a challenging task for many organizations. Additionally, the lack of skilled professionals in this area can hinder market growth. Companies seeking to capitalize on the opportunities presented by the market must invest in building the necessary expertise or partnering with specialized firms to overcome these challenges effectively.

What will be the Size of the TinyML Market during the forecast period?

Explore in-depth regional segment analysis with market size data - historical 2019-2023 and forecasts 2025-2029 - in the full report.

In the realm of Machine Learning (ML), TinyML is gaining traction as a significant market segment, particularly in industrial automation and IoT applications. Performance benchmarking is crucial in this domain, focusing on inference speed, model deployment, and memory footprint. Industrial automation and medical devices are prime areas of application, where data compression and real-world applications are essential. Hardware platforms and microcontroller programming play a pivotal role in TinyML's success. Data security, data augmentation, and regularization techniques are vital for enhancing model accuracy and reliability. Testing methodologies, activation functions, and hyperparameter tuning are essential development tools for creating efficient ML models.

Ethical implications and privacy considerations are increasingly important in this market, with power consumption being a key concern. Smart agriculture and wearable technology are emerging sectors where TinyML's low power consumption and real-time inference capabilities offer significant advantages. Loss functions and gradient descent are essential techniques for model evaluation and optimization. Development tools and software libraries facilitate the creation of efficient ML models, enabling faster time-to-market and improved performance.

How is this TinyML Industry segmented?

The TinyML industry research report provides comprehensive data (region-wise segment analysis), with forecasts and estimates in 'USD million' for the period 2025-2029, as well as historical data from 2019-2023 for the following segments.

Component: Solutions, Services
Application: Process optimization, Health monitoring, Smart agriculture, Environmental monitoring, Autonomous vehicles
End-user: Healthcare, Manufacturing, Consumer electronics, Agriculture, Others
Geography: North America (US, Canada); Europe (France, Germany, UK); APAC (China, India, Japan, South Korea); South America (Brazil); Rest of World (ROW)

By Component Insights

The Solutions segment is estimated to witness significant growth during the forecast period. The market is witnessing significant advancements in energy-efficient algorithms and neural network pruning, enabling on-device training of classification algorithms for IoT integration. Real-time processing and accuracy metrics are crucial for efficient deep learning, driving the adoption of microcontroller optimization and latency optimization techniques. Edge computing and signal processing are essential for performance evaluation and sensor data analytics, leading to the development of binary neural networks and time series analysis methods. Feature extraction and power constraints are key considerations for deployment strategies, which include sensor fusion and embedded machine learning.

Pattern recognition, model quantization, and ultra-low power solutions cater to resource-constrained devices, while on-device inference and low-power sensors optimize memory and hardware acceleration. Anomaly detection and regression models are integral to TinyML frameworks, enabling on-chip processing and real-time response. The market is characterized by intense competition, with hardware and software innovations prop
Results on di-peptide composition (DPC) data.
plos.figshare.com
datasetcatalog.nlm.nih.gov
xls
Updated Jun 18, 2025
Cite
Waqar Ahmad; Abdul Raheem Shahzad; Muhammad Awais Amin; Waqas Haider Bangyal; Tahani Jaser Alahmadi; Saddam Hussain Khan (2025). Results on di-peptide composition (DPC) data. [Dataset]. http://doi.org/10.1371/journal.pone.0321761.t005
Explore at:
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pone.0321761.t005
Dataset updated
Jun 18, 2025
Dataset provided by
PLOS ONE
Authors
Waqar Ahmad; Abdul Raheem Shahzad; Muhammad Awais Amin; Waqas Haider Bangyal; Tahani Jaser Alahmadi; Saddam Hussain Khan
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The prevalence of Leukaemia, a malignant blood cancer that originates from hematopoietic progenitor cells, is increasing in Southeast Asia, with a worrisome fatality rate of 54%. Predicting outcomes in the early stages is vital for improving the chances of patient recovery. The aim of this research is to enhance early-stage prediction systems in a substantial manner. Using Machine Learning and Data Science, we exploit protein sequential data from commonly altered genes including BCL2, HSP90, PARP, and RB to make predictions for Chronic Myeloid Leukaemia (CML). The methodology we implement is based on the utilisation of reliable methods for extracting features, namely Di-peptide Composition (DPC), Amino Acid Composition (AAC), and Pseudo amino acid composition (Pse-AAC). We also take into consideration the identification and handling of outliers, as well as the validation of feature selection using the Pearson Correlation Coefficient (PCA). Data augmentation guarantees a comprehensive dataset for analysis. By utilising several Machine Learning models such as Support Vector Machine (SVM), XGBoost, Random Forest (RF), K Nearest Neighbour (KNN), Decision Tree (DT), and Logistic Regression (LR), we have achieved accuracy rates ranging from 66% to 94%. These classifiers are thoroughly evaluated utilising performance criteria such as accuracy, sensitivity, specificity, F1-score, and the confusion matrix. The solution we suggest is a user-friendly online application dashboard that can be used for early detection of CML. This tool has significant implications for practitioners and may be used in healthcare institutions and hospitals.
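For readers unfamiliar with Di-peptide Composition, the sketch below follows the commonly used definition: the relative frequency of each of the 400 ordered pairs of adjacent amino acids in a sequence. It is not taken from the authors' code; the function name, the handling of non-standard letters, and the toy sequence are assumptions.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
DIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]  # 400 ordered pairs

def dipeptide_composition(sequence):
    """Return a 400-element list of adjacent-pair frequencies for a protein sequence."""
    counts = dict.fromkeys(DIPEPTIDES, 0)
    pairs = [sequence[i:i + 2] for i in range(len(sequence) - 1)]
    valid = [p for p in pairs if p in counts]  # skip pairs with non-standard letters
    for p in valid:
        counts[p] += 1
    total = len(valid)
    return [counts[d] / total if total else 0.0 for d in DIPEPTIDES]

features = dipeptide_composition("MKVLAAGIVAA")  # made-up toy sequence
print(len(features), round(sum(features), 6))
```

Each sequence maps to a fixed-length vector regardless of its length, which is what allows classifiers such as SVM or Random Forest to consume it directly.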
Raw data.
plos.figshare.com
xlsx
Updated Jun 24, 2025
Cite
Soheil Mohammadi; Ali Jahanshahi; Mohammad Shahrabi Farahani; Mohammad Amin Salehi; Negin Frounchi; Ali Guermazi (2025). Raw data. [Dataset]. http://doi.org/10.1371/journal.pone.0326339.s015
Explore at:
Available download formats: xlsx
Unique identifier
https://doi.org/10.1371/journal.pone.0326339.s015
Dataset updated
Jun 24, 2025
Dataset provided by
PLOS ONE
Authors
Soheil Mohammadi; Ali Jahanshahi; Mohammad Shahrabi Farahani; Mohammad Amin Salehi; Negin Frounchi; Ali Guermazi
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Aim of the study: The aim was to systematically review the literature and perform a meta-analysis to estimate the performance of artificial intelligence (AI) algorithms in detecting meniscal injuries.

Materials and methods: A systematic search was performed in the Scopus, PubMed, EBSCO, Cinahl, Web of Science, IEEE Xplore, and Cochrane Central databases in July 2024. The included studies’ reporting quality and risk of bias were evaluated using the Transparent Reporting of a multivariable prediction model for Individual Prognosis or Diagnosis (TRIPOD) and the Prediction Model Study Risk of Bias Assessment Tool (PROBAST), respectively. Also, a meta-analysis was done using contingency tables to estimate diagnostic performance metrics (sensitivity and specificity), and a meta-regression analysis was performed to investigate the effect of the following variables on the main outcome: imaging view, data augmentation and transfer learning usage, and presence of meniscal tear in the injury, with a corresponding 95% confidence interval (CI) and a P-value of 0.05 as a threshold for significance.

Results: Among 28 included studies, 92 contingency tables were extracted from 15 studies. The reference standards of the studies were mostly expert radiologists, orthopedists, or surgical reports. The pooled sensitivity and specificity for AI algorithms on internal validation were 81% (95% CI: 78, 85) and 78% (95% CI: 72, 83), and for clinicians on internal validation were 85% (95% CI: 76, 91) and 88% (95% CI: 83, 92), respectively. The pooled sensitivity and specificity for studies validating algorithms with an external test set were 82% (95% CI: 74, 88) and 88% (95% CI: 84, 91), respectively.

Conclusion: The results of this study imply the lower diagnostic performance of AI-based algorithms in knee meniscal injuries compared with clinicians.
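The sensitivity and specificity referred to above are each derived from a 2x2 contingency table. The sketch below shows that calculation for a single table with made-up counts; the function name is illustrative. Pooling estimates across studies, as a meta-analysis does, requires a statistical model that is not shown here.

```python
def sensitivity_specificity(tp, fp, fn, tn):
    """Sensitivity and specificity from one 2x2 contingency table."""
    sensitivity = tp / (tp + fn)  # share of true injuries that were detected
    specificity = tn / (tn + fp)  # share of healthy cases correctly cleared
    return sensitivity, specificity

# Made-up counts for one hypothetical study, not taken from the dataset.
sens, spec = sensitivity_specificity(tp=45, fp=8, fn=5, tn=42)
print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
```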
Study characteristics, external validation.
plos.figshare.com
xls
Updated Jun 24, 2025
Cite
Soheil Mohammadi; Ali Jahanshahi; Mohammad Shahrabi Farahani; Mohammad Amin Salehi; Negin Frounchi; Ali Guermazi (2025). Study characteristics, external validation. [Dataset]. http://doi.org/10.1371/journal.pone.0326339.t002
Explore at:
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pone.0326339.t002
Dataset updated
Jun 24, 2025
Dataset provided by
PLOS ONE
Authors
Soheil Mohammadi; Ali Jahanshahi; Mohammad Shahrabi Farahani; Mohammad Amin Salehi; Negin Frounchi; Ali Guermazi
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Aim of the study: The aim was to systematically review the literature and perform a meta-analysis to estimate the performance of artificial intelligence (AI) algorithms in detecting meniscal injuries.

Materials and methods: A systematic search was performed in the Scopus, PubMed, EBSCO, Cinahl, Web of Science, IEEE Xplore, and Cochrane Central databases in July 2024. The included studies’ reporting quality and risk of bias were evaluated using the Transparent Reporting of a multivariable prediction model for Individual Prognosis or Diagnosis (TRIPOD) and the Prediction Model Study Risk of Bias Assessment Tool (PROBAST), respectively. Also, a meta-analysis was done using contingency tables to estimate diagnostic performance metrics (sensitivity and specificity), and a meta-regression analysis was performed to investigate the effect of the following variables on the main outcome: imaging view, data augmentation and transfer learning usage, and presence of meniscal tear in the injury, with a corresponding 95% confidence interval (CI) and a P-value of 0.05 as a threshold for significance.

Results: Among 28 included studies, 92 contingency tables were extracted from 15 studies. The reference standards of the studies were mostly expert radiologists, orthopedists, or surgical reports. The pooled sensitivity and specificity for AI algorithms on internal validation were 81% (95% CI: 78, 85) and 78% (95% CI: 72, 83), and for clinicians on internal validation were 85% (95% CI: 76, 91) and 88% (95% CI: 83, 92), respectively. The pooled sensitivity and specificity for studies validating algorithms with an external test set were 82% (95% CI: 74, 88) and 88% (95% CI: 84, 91), respectively.

Conclusion: The results of this study imply the lower diagnostic performance of AI-based algorithms in knee meniscal injuries compared with clinicians.
Data from: Glucocorticoid augmentation of prolonged exposure therapy:...
tandf.figshare.com
pdf
Updated May 30, 2023
Cite
Rachel Yehuda; LindaM. Bierer; Laura Pratchett; Monica Malowney (2023). Glucocorticoid augmentation of prolonged exposure therapy: rationale and case report [Dataset]. http://doi.org/10.6084/m9.figshare.21776407.v1
Explore at:
Available download formats: pdf
Unique identifier
https://doi.org/10.6084/m9.figshare.21776407.v1
Dataset updated
May 30, 2023
Dataset provided by
Taylor & Francis
Authors
Rachel Yehuda; LindaM. Bierer; Laura Pratchett; Monica Malowney
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Prolonged exposure (PE) therapy has been found to reduce symptoms of posttraumatic stress disorder (PTSD); however, it is difficult for many patients to engage fully in the obligatory retelling of their traumatic experiences. This problem is compounded by the fact that habituation and cognitive restructuring – the main mechanisms through which PE is hypothesized to work – are not instantaneous processes, and often require several weeks before the distress associated with imaginal exposure abates.

Case reports: Two cases are described that respectively illustrate the use of hydrocortisone and placebo, in combination with PE, for the treatment of combat-related PTSD. Based on known effects of glucocorticoids on learning and memory performance, we hypothesized that augmentation with hydrocortisone would improve the therapeutic effects of PE by hastening “new” learning and facilitating decreases in the emotional impact of fear memories during the course of treatment. The veteran receiving hydrocortisone augmentation of PE displayed an accelerated and ultimately greater decline in PTSD symptoms than the veteran receiving placebo.

Conclusions: While no general conclusion can be derived from comparison of two patients, the findings are consistent with the rationale for augmentation. These case reports support the potential for an appropriately designed and powered clinical trial to examine the efficacy of glucocorticoids in augmenting the effects of psychotherapy for PTSD. For the abstract or full text in other languages, please see Supplementary files (under Reading Tools online).