18 datasets found
  1. Packages Object Detection Dataset - augmented-v1

    • public.roboflow.com
    zip
    Updated Jan 14, 2021
    Cite
    Roboflow Community (2021). Packages Object Detection Dataset - augmented-v1 [Dataset]. https://public.roboflow.com/object-detection/packages-dataset/5
    Available download formats: zip
    Dataset updated
    Jan 14, 2021
    Dataset provided by
    Roboflow, Inc.
    Authors
    Roboflow Community
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Variables measured
    Bounding Boxes of packages
    Description

    About This Dataset

    The Roboflow Packages dataset is a collection of packages located at the doors of various apartments and homes. Packages are flat envelopes, small boxes, and large boxes. Some images contain multiple annotated packages.

    Usage

    This dataset may be used as a good starter dataset to track and identify when a package has been delivered to a home. Perhaps you want to know when a package arrives to claim it quickly or prevent package theft.
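
One way to turn per-frame detections into a "package delivered" signal is a simple debounce over the number of package boxes seen in each frame. The sketch below is only an illustration: the function name `delivery_events`, the `min_frames` threshold, and the idea of feeding it per-frame box counts from some detector are all assumptions, not part of the dataset or of any Roboflow API.

```python
def delivery_events(package_counts, min_frames=3):
    """Return frame indices at which a new package is considered delivered.

    package_counts: number of package boxes detected in each frame.
    A delivery is reported once the count has stayed above the last
    confirmed level for `min_frames` consecutive frames.
    """
    events = []
    stable = 0   # last confirmed number of packages at the door
    streak = 0   # consecutive frames showing more packages than `stable`
    for i, count in enumerate(package_counts):
        if count > stable:
            streak += 1
            if streak >= min_frames:
                events.append(i)
                stable = count
                streak = 0
        else:
            streak = 0
            stable = min(stable, count)  # packages may have been picked up
    return events
```

The debounce is there so that a single spurious detection does not trigger an alert; the right threshold would depend on the camera frame rate and the detector's error rate.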

    If you plan to use this dataset and adapt it to your own front door, it is recommended that you capture and add images from the context of your specific camera position. You can easily add images to this dataset via the web UI or via the Roboflow Upload API.

    About Roboflow

    Roboflow enables teams to build better computer vision models faster. We provide tools for image collection, organization, labeling, preprocessing, augmentation, training and deployment. Developers reduce boilerplate code when using Roboflow's workflow, save training time, and increase model reproducibility.


  2. Supplementary file 1_Data augmented lung cancer prediction framework using...

    • frontiersin.figshare.com
    docx
    Updated Feb 25, 2025
    Cite
    Yifan Jiang; Venkata S. K. Manem (2025). Supplementary file 1_Data augmented lung cancer prediction framework using the nested case control NLST cohort.docx [Dataset]. http://doi.org/10.3389/fonc.2025.1492758.s001
    Available download formats: docx
    Dataset updated
    Feb 25, 2025
    Dataset provided by
    Frontiers
    Authors
    Yifan Jiang; Venkata S. K. Manem
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Purpose: In the context of lung cancer screening, the scarcity of well-labeled medical images poses a significant challenge to implementing supervised deep learning methods. While data augmentation is an effective technique for countering the difficulties caused by insufficient data, it has not been fully explored in the context of lung cancer screening. In this research study, we analyzed the state-of-the-art (SOTA) data augmentation techniques for lung cancer binary prediction.

    Methods: To comprehensively evaluate the efficiency of data augmentation approaches, we considered the nested case control National Lung Screening Trial (NLST) cohort comprising 253 individuals who had the commonly used CT scans without contrast. The CT scans were pre-processed into three-dimensional volumes based on the lung nodule annotations. Subsequently, we evaluated five basic (online) and two generative model-based offline data augmentation methods with ten SOTA 3D deep learning-based lung cancer prediction models.

    Results: Our results demonstrated that the performance improvement from data augmentation was highly dependent on the approach used. The Cutmix method resulted in the highest average performance improvement across all three metrics: 1.07%, 3.29%, 1.19% for accuracy, F1 score and AUC, respectively. MobileNetV2 with a simple data augmentation approach achieved the best AUC of 0.8719 among all lung cancer predictors, demonstrating a 7.62% improvement compared to baseline. Furthermore, the MED-DDPM data augmentation approach was able to improve prediction performance by rebalancing the training set and adding moderately synthetic data.

    Conclusions: The effectiveness of online and offline data augmentation methods was highly sensitive to the prediction model, highlighting the importance of carefully selecting the optimal data augmentation method. Our findings suggest that certain traditional methods can provide more stable and higher performance compared to SOTA online data augmentation approaches. Overall, these results offer meaningful insights for the development and clinical integration of data augmented deep learning tools for lung cancer screening.
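
For readers unfamiliar with Cutmix, the idea can be sketched in a few lines: a rectangular patch from one training image is pasted into another, and the labels are mixed in proportion to the pasted area. The sketch below is a simplified 2D version on nested lists, with the mixing ratio drawn uniformly and the patch kept fully inside the image. It is not the study's implementation, which operates on 3D CT volumes; the function name and arguments are illustrative.

```python
import random

def cutmix(img_a, img_b, label_a, label_b, rng=random):
    """Paste a random rectangle of img_b into img_a and mix the labels.

    Images are equally sized 2D lists; labels are numbers (e.g. 0 or 1).
    Returns (mixed_image, mixed_label, lam), where lam is the fraction of
    pixels that still come from img_a.
    """
    h, w = len(img_a), len(img_a[0])
    lam = rng.random()                      # target share kept from img_a
    cut_h = int(h * (1 - lam) ** 0.5)       # patch size so that its area
    cut_w = int(w * (1 - lam) ** 0.5)       # is roughly (1 - lam) * h * w
    top = rng.randint(0, h - cut_h)
    left = rng.randint(0, w - cut_w)
    mixed = [row[:] for row in img_a]       # copy, leave img_a untouched
    for r in range(top, top + cut_h):
        mixed[r][left:left + cut_w] = img_b[r][left:left + cut_w]
    lam = 1 - (cut_h * cut_w) / (h * w)     # recompute from the actual patch
    return mixed, lam * label_a + (1 - lam) * label_b, lam
```

Because the label becomes a weighted mix, this kind of augmentation is normally applied online, per batch, during training rather than stored on disk.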

  3. Data from: Augmentation of telemedicine post-operative follow-up after...

    • datasetcatalog.nlm.nih.gov
    Updated Aug 3, 2022
    Cite
    Vagefi, M. Reza; Grob, Seanna R.; Ahmad, Meleha; Winn, Bryan J.; Smith, Loreley D.; Ashraf, Davin C.; Kersten, Robert C.; Miller, Amanda (2022). Augmentation of telemedicine post-operative follow-up after oculofacial plastic surgery with a self-guided patient tool [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0000249833
    Dataset updated
    Aug 3, 2022
    Authors
    Vagefi, M. Reza; Grob, Seanna R.; Ahmad, Meleha; Winn, Bryan J.; Smith, Loreley D.; Ashraf, Davin C.; Kersten, Robert C.; Miller, Amanda
    Description

    This study evaluates a web-based tool designed to augment telemedicine post-operative visits after periocular surgery. Adult, English-speaking patients undergoing periocular surgery with telemedicine follow-up were studied prospectively in this interventional case series. Participants submitted visual acuity measurements and photographs via a web-based tool prior to routine telemedicine post-operative visits. An after-visit survey assessed patient perceptions. Surgeons rated photographs and live video for quality and blurriness; external raters also evaluated photographs. Images were analyzed for facial centration, resolution, and algorithmically detected blur. Complications were recorded and graded for severity and relation to telemedicine. Seventy-nine patients were recruited. Surgeons requested an in-person assessment for six patients (7.6%) due to inadequate evaluation by telemedicine. Surgeons rated patient-provided photographs to be of higher quality than live video at the time of the post-operative visit (p < 0.001). Image blur and resolution had moderate and weak correlation with photograph quality, respectively. A photograph blur detection algorithm demonstrated sensitivity of 85.5% and specificity of 75.1%. One patient experienced a wound dehiscence with a possible relationship to inadequate evaluation during telemedicine follow-up. Patients rated the telemedicine experience and their comfort with the structure of the visit highly. Augmented telemedicine follow-up after oculofacial plastic surgery is associated with high patient satisfaction, rare conversion to clinic evaluation, and few related post-operative complications. Automated detection of image resolution and blur may play a role in screening photographs for subsequent iterations of the web-based tool.
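
The sensitivity and specificity reported for the blur detection algorithm follow from a standard confusion-matrix calculation. A minimal sketch is shown below, assuming binary labels where 1 means "blurry" and 0 means "sharp", and assuming both classes occur in the reference labels; the function name is illustrative and the study's own data and code are not reproduced here.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # blurry, flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # blurry, missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # sharp, passed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # sharp, flagged
    sensitivity = tp / (tp + fn)   # share of blurry photos that were caught
    specificity = tn / (tn + fp)   # share of sharp photos that were passed
    return sensitivity, specificity
```

For a screening step that asks patients to retake a photo, high sensitivity matters most, since a false alarm only costs one extra photograph.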

  4. Literature collection of Text Data Augmentation

    • scidb.cn
    Updated Aug 22, 2024
    Cite
    Feng Ran (2024). Literature collection of Text Data Augmentation [Dataset]. http://doi.org/10.57760/sciencedb.j00133.00356
    Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Aug 22, 2024
    Dataset provided by
    Science Data Bank
    Authors
    Feng Ran
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

    A list of references obtained by searching and screening two English databases, Web of Science (WOS) and Google Scholar, as well as two Chinese databases, CNKI and Wanfang Data, using "text enhancement" as the keyword. The time range is from 2015 to 2024. Each entry includes the title, enhancement method, category, datasets, and tools.

  5. Data from: Combining Group Contribution Method and Semisupervised Learning...

    • acs.figshare.com
    • datasetcatalog.nlm.nih.gov
    xlsx
    Updated Dec 26, 2024
    Cite
    Zhao Liu; Lanyu Shang; Kuan Huang; Zhenrui Yue; Alan Y. Han; Dong Wang; Huichun Zhang (2024). Combining Group Contribution Method and Semisupervised Learning to Build Machine Learning Models for Predicting Hydroxyl Radical Rate Constants of Water Contaminants [Dataset]. http://doi.org/10.1021/acs.est.4c11950.s002
    Available download formats: xlsx
    Dataset updated
    Dec 26, 2024
    Dataset provided by
    ACS Publications
    Authors
    Zhao Liu; Lanyu Shang; Kuan Huang; Zhenrui Yue; Alan Y. Han; Dong Wang; Huichun Zhang
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    Machine learning is an effective tool for predicting reaction rate constants for many organic compounds with the hydroxyl radical (HO•). Previously reported models have achieved relatively good performance, but due to scarce data (

  6. Precision, recall, and F1-score for each class.

    • plos.figshare.com
    xls
    Updated May 9, 2025
    Cite
    Yuki Wong; Eileen Lee Ming Su; Che Fai Yeong; William Holderbaum; Chenguang Yang (2025). Precision, recall, and F1-score for each class. [Dataset]. http://doi.org/10.1371/journal.pone.0322624.t003
    Available download formats: xls
    Dataset updated
    May 9, 2025
    Dataset provided by
    PLOS ONE
    Authors
    Yuki Wong; Eileen Lee Ming Su; Che Fai Yeong; William Holderbaum; Chenguang Yang
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Brain tumors pose a significant medical challenge, necessitating early detection and precise classification for effective treatment. This study aims to address this challenge by introducing an automated brain tumor classification system that utilizes deep learning (DL) and Magnetic Resonance Imaging (MRI) images. The main purpose of this research is to develop a model that can accurately detect and classify different types of brain tumors, including glioma, meningioma, pituitary tumors, and normal brain scans. A convolutional neural network (CNN) architecture with pretrained VGG16 as the base model is employed, and diverse public datasets are utilized to ensure comprehensive representation. Data augmentation techniques are employed to enhance the training dataset, resulting in a total of 17,136 brain MRI images across the four classes. The accuracy of this model was 99.24%, a higher accuracy than other similar works, demonstrating its potential clinical utility. This higher accuracy was achieved mainly due to the utilization of a large and diverse dataset, the improvement of network configuration, the application of a fine-tuning strategy to adjust pretrained weights, and the implementation of data augmentation techniques in enhancing classification performance for brain tumor detection. In addition, a web application was developed by leveraging HTML and Dash components to enhance usability, allowing for easy image upload and tumor prediction. By harnessing artificial intelligence (AI), the developed system addresses the need to reduce human error and enhance diagnostic accuracy. The proposed approach provides an efficient and reliable solution for brain tumor classification, facilitating early diagnosis and enabling timely medical interventions. This work signifies a potential advancement in brain tumor classification, promising improved patient care and outcomes.
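
The table this entry refers to reports precision, recall, and F1-score per class. As a rough sketch of how such per-class figures are derived from predictions, the function below computes them from two label lists. The function name and the example labels in the usage note are illustrative only; the paper's own predictions are not part of this listing.

```python
def per_class_metrics(y_true, y_pred):
    """Return {class: (precision, recall, f1)} using one-vs-rest counts."""
    report = {}
    pairs = list(zip(y_true, y_pred))
    for c in sorted(set(y_true) | set(y_pred)):
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        report[c] = (precision, recall, f1)
    return report
```

Calling `per_class_metrics(["glioma", "glioma", "normal"], ["glioma", "normal", "normal"])` would return one tuple per class; per-class figures like these are more informative than overall accuracy when class sizes differ.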

  7. The model alterations description per epoch during training.

    • plos.figshare.com
    xls
    Updated Jun 21, 2023
    Cite
    Amin Tajerian; Mohsen Kazemian; Mohammad Tajerian; Ava Akhavan Malayeri (2023). The model alterations description per epoch during training. [Dataset]. http://doi.org/10.1371/journal.pone.0284437.t002
    Available download formats: xls
    Dataset updated
    Jun 21, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Amin Tajerian; Mohsen Kazemian; Mohammad Tajerian; Ava Akhavan Malayeri
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The model alterations description per epoch during training.

  8. Federated Learning Market Analysis, Size, and Forecast 2025-2029: North...

    • technavio.com
    Updated Jul 2, 2025
    Cite
    Technavio (2025). Federated Learning Market Analysis, Size, and Forecast 2025-2029: North America (US, Canada, and Mexico), Europe (France, Germany, and UK), APAC (China, India, and Japan), South America (Brazil), and Rest of World (ROW) [Dataset]. https://www.technavio.com/report/federated-learning-market-industry-analysis
    Dataset updated
    Jul 2, 2025
    Dataset provided by
    TechNavio
    Authors
    Technavio
    Time period covered
    2021 - 2025
    Area covered
    Mexico, Brazil, Canada, Germany, United States, Global
    Description


    Federated Learning Market Size 2025-2029

    The federated learning market size is forecast to increase by USD 301.1 million at a CAGR of 15.9% between 2024 and 2029.
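
The forecast gives an absolute increase and a compound annual growth rate (CAGR) but not the base-year market size. Under two assumptions that the text does not confirm, namely that growth compounds over five annual periods (2024 to 2029) and that the increase is the end value minus the start value, the implied base can be back-calculated as a sanity check. The function name is illustrative.

```python
def implied_base(increase, cagr, years):
    """Start value implied by an absolute increase at a given CAGR.

    Solves increase = base * ((1 + cagr) ** years - 1) for base.
    """
    return increase / ((1 + cagr) ** years - 1)

# Assumption: five compounding years between 2024 and 2029.
base_2024 = implied_base(increase=301.1, cagr=0.159, years=5)   # USD million
end_2029 = base_2024 * (1 + 0.159) ** 5                          # USD million
```

This is only a back-of-envelope reading of the headline numbers; the report itself may define the base year or the compounding period differently.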

    The market is experiencing significant growth, driven by increasing data privacy regulations and growing privacy concerns. As organizations seek to protect sensitive data while still leveraging machine learning capabilities, federated learning's decentralized approach offers an attractive solution. However, this market is not without challenges. The rise of vertical-specific federated learning platforms necessitates a focused approach for companies looking to capitalize on this technology. Machine learning algorithms, including random forests, naive Bayes, decision trees, clustering algorithms, and k-nearest neighbors, are essential tools for risk management and compliance monitoring.
    Companies must navigate these challenges to effectively implement federated learning and reap the rewards of improved data privacy and enhanced machine learning capabilities. By addressing these obstacles, organizations can successfully harness the power of federated learning to gain a competitive edge in their industries. Additionally, data security and privacy concerns are becoming increasingly important, requiring strong security measures to protect sensitive industrial data. Furthermore, the significant technical complexity and statistical heterogeneity associated with federated learning require specialized expertise and resources.
    

    What will be the Size of the Federated Learning Market during the forecast period?

    Explore in-depth regional segment analysis with market size data - historical 2019-2023 and forecasts 2025-2029 - in the full report.

    In the dynamic world of machine learning, federated learning is gaining traction as a promising approach for personalized learning and recommendation systems. Reinforcement learning techniques are being employed to optimize models in a decentralized manner, enhancing model robustness and adaptability. Online learning environments leverage federated analytics to deliver customized educational content, while collaboration between learners is facilitated through collaborative filtering. Identity management and privacy impact assessments are crucial components of federated learning, ensuring data lineage and provenance are maintained. Security measures, such as model robustness assessments, penetration testing, threat modeling, and encryption keys, safeguard against adversarial attacks and data breaches. Big data and AI are at the heart of this transformation, with machine learning algorithms enabling real-time monitoring and predictive maintenance of industrial machines.

    Ensemble methods and key management systems contribute to improved model accuracy and access control, respectively. Decentralized identity solutions, like smart contracts and blockchain technology, provide secure and private data sharing mechanisms. Data augmentation and transfer learning techniques enable model training on distributed data, further enhancing the effectiveness of federated learning. Zero-knowledge proofs and adaptive learning algorithms enable secure and efficient knowledge sharing, fostering a collaborative learning environment. Overall, federated learning continues to evolve, offering significant potential for online learning and recommendation systems while addressing data security and privacy concerns.
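
To make "model training on distributed data" concrete, the core aggregation step of federated averaging (one of the technologies named in this report's segmentation) can be sketched as a weighted mean of client model parameters, weighted by how many samples each client trained on. The flat list-of-floats parameter layout and the function name are simplifying assumptions for illustration.

```python
def federated_average(client_weights, client_sizes):
    """Average client parameter vectors, weighted by local dataset size.

    client_weights: list of equal-length lists of floats, one per client.
    client_sizes:   number of training samples held by each client.
    """
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(n_params)
    ]
```

Only the parameter vectors and sample counts leave each client in this scheme; the raw training data stays local, which is the privacy argument the report relies on.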

    How is this Federated Learning Industry segmented?

    The federated learning industry research report provides comprehensive data (region-wise segment analysis), with forecasts and estimates in 'USD million' for the period 2025-2029, as well as historical data from 2019-2023 for the following segments.

    Deployment
      Cloud
      On-premises

    Type
      Horizontal federated learning
      Vertical federated learning
      Federated transfer learning

    End-user
      Healthcare
      BFSI
      Manufacturing
      Automotive
      IT and telecom

    Technology
      Federated averaging
      Differential privacy
      Homomorphic encryption

    Geography
      North America
        US
        Canada
        Mexico
      Europe
        France
        Germany
        UK
      APAC
        China
        India
        Japan
      South America
        Brazil
      Rest of World (ROW)

    By Deployment Insights

    The Cloud segment is estimated to witness significant growth during the forecast period. The market is witnessing significant growth, driven by the integration of various advanced technologies. Artificial intelligence, neural networks, and deep learning are at the core of this evolution, enabling model training on decentralized data without compromising data privacy and security. Edge computing plays a crucial role in this paradigm, allowing for local processing and model compression, reducing power consumption and data transmission. Data silos and privacy concerns are addressed through techniques su

  9. Confusion matrix (DPC data).

    • datasetcatalog.nlm.nih.gov
    • plos.figshare.com
    Updated Jun 18, 2025
    Cite
    Khan, Saddam Hussain; Shahzad, Abdul Raheem; Bangyal, Waqas Haider; Amin, Muhammad Awais; Ahmad, Waqar; Alahmadi, Tahani Jaser (2025). Confusion matrix (DPC data). [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0002060685
    Dataset updated
    Jun 18, 2025
    Authors
    Khan, Saddam Hussain; Shahzad, Abdul Raheem; Bangyal, Waqas Haider; Amin, Muhammad Awais; Ahmad, Waqar; Alahmadi, Tahani Jaser
    Description

    The prevalence of Leukaemia, a malignant blood cancer that originates from hematopoietic progenitor cells, is increasing in Southeast Asia, with a worrisome fatality rate of 54%. Predicting outcomes in the early stages is vital for improving the chances of patient recovery. The aim of this research is to enhance early-stage prediction systems in a substantial manner. Using Machine Learning and Data Science, we exploit protein sequential data from commonly altered genes including BCL2, HSP90, PARP, and RB to make predictions for Chronic Myeloid Leukaemia (CML). The methodology we implement is based on the utilisation of reliable methods for extracting features, namely Di-peptide Composition (DPC), Amino Acid Composition (AAC), and Pseudo amino acid composition (Pse-AAC). We also take into consideration the identification and handling of outliers, as well as the validation of feature selection using the Pearson Correlation Coefficient (PCC). Data augmentation guarantees a comprehensive dataset for analysis. By utilising several Machine Learning models such as Support Vector Machine (SVM), XGBoost, Random Forest (RF), K Nearest Neighbour (KNN), Decision Tree (DT), and Logistic Regression (LR), we have achieved accuracy rates ranging from 66% to 94%. These classifiers are thoroughly evaluated utilising performance criteria such as accuracy, sensitivity, specificity, F1-score, and the confusion matrix. The solution we suggest is a user-friendly online application dashboard that can be used for early detection of CML. This tool has significant implications for practitioners and may be used in healthcare institutions and hospitals.
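
Amino Acid Composition (AAC) and Di-peptide Composition (DPC) are simple frequency features: AAC is the fraction of each of the 20 standard amino acids in a sequence, and DPC is the fraction of each of the 400 possible adjacent pairs. The sketch below shows the usual definitions on a plain one-letter sequence string; it is not the authors' code, and the function names are illustrative.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard one-letter codes

def aac(seq):
    """Amino Acid Composition: 20 relative frequencies."""
    return {a: seq.count(a) / len(seq) for a in AMINO_ACIDS}

def dpc(seq):
    """Di-peptide Composition: 400 relative frequencies of adjacent pairs."""
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    return {
        a + b: pairs.count(a + b) / len(pairs)
        for a, b in product(AMINO_ACIDS, repeat=2)
    }
```

Both functions return fixed-length feature vectors regardless of sequence length, which is what makes them usable as inputs to classifiers such as SVM or Random Forest.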

  10. Bone Fracture X-ray Dataset: Simple vs. Comminuted Fractures

    • data.mendeley.com
    Updated Dec 3, 2024
    Cite
    Fahim Faisal Talha Talha (2024). Bone Fracture X-ray Dataset: Simple vs. Comminuted Fractures [Dataset]. http://doi.org/10.17632/vg95gvhj3y.3
    Dataset updated
    Dec 3, 2024
    Authors
    Fahim Faisal Talha Talha
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Dataset Overview: This dataset has been curated to support research in bone fracture classification, focusing on simple and comminuted fractures. It includes high-quality X-ray images and a diverse set of augmented images to facilitate the development and evaluation of machine learning models in medical imaging. The dataset is ideal for image classification, segmentation, and fracture type recognition tasks.

    Fracture Categories:

    1. Simple Fracture
      • Source: Exclusively sourced from hospital records.
      • Original Images: 1,211
      • Augmented Images: 6,311
      • Total Images: 7,522
    2. Comminuted Fracture
      • Source: A mix of hospital-sourced images and web-sourced images (approximately one-tenth from web pages).
      • Original Images: 1,173
      • Augmented Images: 7,366
      • Total Images: 8,539

    Key Features:

      • Number of Original Images: 2,384
      • Number of Augmented Images: 13,677
      • Total Dataset Size: 16,061 images (Original + Augmented)
      • File Formats: JPG

    Augmentation Techniques:

      • Zoom: Randomized scaling
      • Rotation: ±30° rotation
      • Brightness and Contrast Adjustments: 80%–120% range
      • Flips: Vertical and horizontal flips
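
Two of the listed augmentations, flips and brightness scaling in the 80%–120% range, are simple enough to sketch on a grayscale image stored as a nested list of 0–255 values. Rotation and zoom are left out because they need interpolation. This is an illustration of the techniques named above, not the dataset author's pipeline; the function names and the 50% flip probability are assumptions.

```python
import random

def hflip(img):
    """Mirror the image left to right."""
    return [row[::-1] for row in img]

def vflip(img):
    """Mirror the image top to bottom."""
    return [row[:] for row in img[::-1]]

def adjust_brightness(img, factor):
    """Scale pixel values by `factor`, clamped to the 0-255 range."""
    return [[min(255, max(0, round(p * factor))) for p in row] for row in img]

def random_augment(img, rng=random):
    """Apply random flips and a brightness factor drawn from [0.8, 1.2]."""
    out = img
    if rng.random() < 0.5:   # assumed flip probability
        out = hflip(out)
    if rng.random() < 0.5:
        out = vflip(out)
    return adjust_brightness(out, rng.uniform(0.8, 1.2))
```

When a dataset ships pre-augmented images like this one, augmented copies of the same original should stay in the same train or test split, otherwise evaluation scores can be inflated.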

    Update Details (Version 3): Original Submission Date: 15 Nov 2024. Update Date: 02 Dec 2024.

    Changes in This Version: Expanded Dataset Size: From: 6,798 total images (previous version). To: 16,061 total images.

    Augmented Images: Simple Fracture: Increased from 3,280 to 6,311 images. Comminuted Fracture: Increased from 2,900 to 7,366 images. Augmentation Enhancements: Additional transformations were applied to ensure varied and realistic images for training.

    Applications:

    Medical Imaging: Train and evaluate models for fracture classification and identification. Healthcare Technology: Support the development of diagnostic tools and mobile applications for real-time fracture detection. Medical Research: Aid in understanding fracture patterns and their visual indicators.

    Dataset Collection: Simple Fracture Images: Exclusively sourced from hospitals to ensure clinical accuracy and relevance. Comminuted Fracture Images: Sourced from a mix of hospital records and web pages, representing diverse imaging conditions and perspectives.

    Realistic Scenarios: The dataset simulates real-world clinical settings with varying lighting conditions, orientations, and imaging environments to provide robust training data. This dataset is a valuable resource for advancing medical imaging research and developing machine learning models tailored for fracture detection and classification tasks.

  11. Comparing the mean age between skin cancer lesions.

    • plos.figshare.com
    xls
    Updated Jun 21, 2023
    Cite
    Amin Tajerian; Mohsen Kazemian; Mohammad Tajerian; Ava Akhavan Malayeri (2023). Comparing the mean age between skin cancer lesions. [Dataset]. http://doi.org/10.1371/journal.pone.0284437.t001
    Available download formats: xls
    Dataset updated
    Jun 21, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Amin Tajerian; Mohsen Kazemian; Mohammad Tajerian; Ava Akhavan Malayeri
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Comparing the mean age between skin cancer lesions.

  12. Mpox Skin Lesion Dataset Version 2.0 (MSLD v2.0)

    • kaggle.com
    Updated Jul 4, 2023
    Cite
    Joydip Paul (2023). Mpox Skin Lesion Dataset Version 2.0 (MSLD v2.0) [Dataset]. https://www.kaggle.com/datasets/joydippaul/mpox-skin-lesion-dataset-version-20-msld-v20/code
    Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Jul 4, 2023
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Joydip Paul
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    Context

    During the initial peak outbreak phase of Mpox, a significant challenge emerged due to the absence of a publicly available reliable dataset for the detection of Mpox. The rapid escalation of Mpox cases, with its potential spread reaching Europe and America as highlighted by the World Health Organization, along with emerging possibilities of Mpox cases in Asian countries, underscored the urgency of implementing computer-assisted detection as a critical tool. In this context, the immediate diagnosis of Mpox became an increasingly challenging endeavor. As the possibility of a Mpox outbreak loomed over densely populated countries like Bangladesh, the limitations of our available resources rendered rapid diagnosis unattainable. Hence, the dire need for computer-assisted detection methods became apparent.

    To address this pressing need, the development of computer-assisted methods demanded an ample amount of diverse data, including skin lesion images of Mpox from individuals of different sexes, ethnicities, and skin tones. However, the scarcity of available data posed a considerable obstacle in this endeavor. In response to this critical situation, our research group took the initiative to develop one of the earliest datasets (MSLD) specifically tailored for Mpox, encompassing various classes including non-Mpox samples.

    From June 2022 to May 2023, the Mpox Skin Lesion Dataset (MSLD) has undergone two iterations, resulting in the current version, MSLD v2.0. The previous version included two classes: "Mpox" and "Others" (non-Mpox), with the "Others" class comprising skin lesion images of chickenpox and measles, chosen for their similarity to Mpox. Building upon the limitations identified in the initial release, we have developed an enhanced and more comprehensive version, MSLD v2.0. This updated dataset encompasses a wider range of classes and provides a more diverse set of images suitable for multi-class classification.

    MSLD v2.0 comprises images from six distinct classes, namely Mpox (284 images), Chickenpox (75 images), Measles (55 images), Cowpox (66 images), Hand-foot-mouth disease or HFMD (161 images), and Healthy (114 images). The dataset includes 755 original skin lesion images sourced from 541 distinct patients, ensuring a representative sample. Importantly, the latest version has received endorsement from professional dermatologists and obtained approval from appropriate regulatory authorities.
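
The six class counts above add up to the stated 755 images, and they are noticeably imbalanced. A short sketch of that arithmetic follows, together with inverse-frequency class weights, which are one common way to compensate during training. The weighting scheme is an assumption for illustration and is not described by the dataset authors.

```python
# Class counts as stated in the dataset description.
counts = {
    "Mpox": 284,
    "Chickenpox": 75,
    "Measles": 55,
    "Cowpox": 66,
    "HFMD": 161,
    "Healthy": 114,
}

total = sum(counts.values())
shares = {name: n / total for name, n in counts.items()}

# Assumed scheme: weight = total / (number of classes * class count),
# so rarer classes receive proportionally larger weights.
class_weights = {name: total / (len(counts) * n) for name, n in counts.items()}
```

With these counts, Mpox makes up a little over a third of the images, so a classifier evaluated on accuracy alone could look good while doing poorly on the smaller classes.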

    Content

    The dataset is organized into two folders:

    Original Images: This folder includes a subfolder named "FOLDS" containing five folds (fold1-fold5) for 5-fold cross-validation with the original images. Each fold has separate folders for the test, train, and validation sets.

    Augmented Images: To enhance the classification task, various data augmentation techniques, such as rotation, translation, reflection, shear, hue, saturation, contrast, brightness jitter, noise, and scaling, were applied using MATLAB R2020a. To ensure result reproducibility, the augmented images are provided in this folder. It contains a subfolder called "FOLDS_AUG" with augmented images of the train sets from each fold in the "FOLDS" subfolder of the "Original Images". The augmentation process resulted in an approximate 14-fold increase in the number of images.

    Naming Convention of the Images

    Each image is named in the format DiseaseCode_PatientNumber_ImageNumber. The disease codes assigned to the six classes are: Mpox -> MKP, Chickenpox -> CHP, Cowpox -> CWP, Measles -> MSL, Hand, foot and mouth disease -> HFMD, Healthy -> HEALTHY. Assignment of the keywords is illustrated in the provided image "Keywords.jpg". For instance, an image named "MKP_17_01" belongs to the Mpox class and is the first image captured from the patient with ID 17.
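
The naming convention can be parsed mechanically, which is handy for grouping images by patient. The sketch below maps the documented disease codes to class names and splits a name into its three parts. The function name is illustrative, and stripping a file extension is an assumption, since the description does not state which extension the files carry.

```python
# Disease codes as documented in the naming convention.
DISEASE_CODES = {
    "MKP": "Mpox",
    "CHP": "Chickenpox",
    "CWP": "Cowpox",
    "MSL": "Measles",
    "HFMD": "Hand, foot and mouth disease",
    "HEALTHY": "Healthy",
}

def parse_image_name(name):
    """Split 'DiseaseCode_PatientNumber_ImageNumber' into its fields."""
    stem = name.rsplit(".", 1)[0] if "." in name else name  # drop extension
    code, patient, image = stem.split("_")
    return {
        "class": DISEASE_CODES[code],
        "patient": int(patient),
        "image": int(image),
    }
```

Grouping by the parsed patient number is one way to confirm that all images from a patient fall in the same fold, which matters because the dataset has 755 images from 541 patients.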

    Data organization

    The dataset includes an Excel file named "datalog.xlsx" consisting of 5 sheets (Sheet1-5), with each sheet corresponding to a specific fold (fold1-5). Each sheet contains three columns: train, validation, and test. These columns contain the names of the images belonging to the respective train, validation, and test sets for a particular fold.

    Web Application

    Since we intend to build an end-to-end solution, starting with dataset creation and ending with a live web app, a prototype of the web app has already been developed using the open-source Python Streamlit framework with a Flask core and is hosted on the Streamlit-provided server for a better user experience. In the app, Skin Lesion Detector, users get not only a suggestion but also the accuracy of the suggestion.

    The codes required to build and train the model, all ...

  13. Cv Cbi Mining Safety Dataset

    • universe.roboflow.com
    zip
    Updated May 29, 2024
    Cite
    MININGSAFETYCBI (2024). Cv Cbi Mining Safety Dataset [Dataset]. https://universe.roboflow.com/miningsafetycbi/cv-cbi-mining-safety/dataset/3
    Available download formats: zip
    Dataset updated
    May 29, 2024
    Dataset authored and provided by
    MININGSAFETYCBI
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Variables measured
    Boots Helms Vests Boots Bounding Boxes
    Description

    Project Description for Roboflow: Mining Safety - PPE Detection

    Project Name: Mining Safety - PPE Detection

    Overview:

    The Mining Safety - PPE Detection project aims to enhance safety protocols in mining environments by leveraging computer vision technology to detect Personal Protective Equipment (PPE). This project focuses on the detection of various PPE items and the absence of mandatory safety gear to ensure that workers adhere to safety regulations, thereby minimizing the risk of accidents and injuries.

    Objective:

    To develop a robust object detection model capable of accurately identifying 13 different classes of PPE in real-time using a dataset sourced from Roboflow Universe. The ultimate goal is to integrate this model into a monitoring system that can alert supervisors about non-compliance with PPE requirements in mining sites.

    PPE Classes (Labels):

    1. Goggles
    2. Helmet
    3. Mask
    4. No-Boots
    5. No-Gloves
    6. No-Helmet
    7. No-Mask
    8. No-Vest
    9. Undefined
    10. Vest
    11. Boots
    12. Ear-Protection
    13. Gloves

    Dataset:

    • Total Images: 7444
    • Source: Roboflow Universe
    • Annotations: Each image is annotated with bounding boxes corresponding to one or more of the 13 PPE classes.
    • Image Variety: The images come from various mining sites with different lighting conditions, camera angles, and worker positions to ensure diversity and robustness of the model.

    Project Steps:

    1. Data Collection and Annotation:

      • Import and utilize the dataset from Roboflow Universe, ensuring it covers diverse conditions and scenarios.
      • Verify and, if necessary, re-annotate images to match the 13 PPE classes accurately using the Roboflow platform.
    2. Data Preprocessing:

      • Perform data augmentation techniques such as rotation, scaling, and cropping to increase the variability and size of the dataset.
      • Split the dataset into training, validation, and test sets (e.g., 80% training, 10% validation, 10% test).
    3. Model Selection and Training:

      • Use a pre-trained YOLO (You Only Look Once) model due to its efficiency and accuracy in real-time object detection tasks.
      • Fine-tune the model on the annotated dataset using transfer learning to adapt it specifically to the mining safety PPE detection task.
    4. Model Evaluation:

      • Evaluate the model's performance using metrics such as precision, recall, F1-score, and mean Average Precision (mAP).
      • Conduct error analysis to identify common misclassifications and refine the model accordingly.
    5. Deployment:

      • Integrate the trained model into a real-time monitoring system.
      • Develop a user interface that displays video feeds and highlights detected PPE and any non-compliance issues.
      • Implement alert mechanisms to notify supervisors of any detected safety violations.
    6. Continuous Improvement:

      • Collect feedback from the deployment to continuously improve the model.
      • Regularly update the dataset with new images and retrain the model to maintain high accuracy.
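    The 80/10/10 split mentioned in the Data Preprocessing step can be sketched as follows. This is an illustration under stated assumptions rather than the project's actual pipeline: `split_dataset` and the `img_0000`-style identifiers are hypothetical, and a real split might instead be produced on the Roboflow platform itself.

```python
import random


def split_dataset(items, train_frac=0.8, val_frac=0.1, seed=0):
    """Shuffle items reproducibly and cut them into train/val/test lists.

    Hypothetical helper: the test set receives whatever remains after
    the train and validation fractions are taken.
    """
    if train_frac <= 0 or val_frac < 0 or train_frac + val_frac > 1:
        raise ValueError("fractions must be non-negative and sum to at most 1")
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # fixed seed for reproducibility
    n_train = int(len(shuffled) * train_frac)
    n_val = int(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test


# 7444 placeholder identifiers, matching the stated image count.
image_ids = [f"img_{i:04d}" for i in range(7444)]
train, val, test = split_dataset(image_ids)
print(len(train), len(val), len(test))
```

    Splitting before augmentation keeps augmented copies of one source image from landing in both the training and the evaluation sets.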

    Expected Outcomes:

    • A high-accuracy object detection model capable of identifying and differentiating between 13 classes of PPE.
    • Enhanced safety monitoring system for mining sites, reducing the likelihood of accidents due to non-compliance with PPE regulations.
    • A scalable solution that can be adapted to other industrial environments requiring PPE detection.

    Tools and Technologies:

    • Annotation Tool: Roboflow
    • Object Detection Model: YOLO (preferably YOLOv8 or YOLOv9 for efficiency)
    • Programming Language: Python
    • Frameworks: PyTorch or TensorFlow for model training and inference
    • Deployment Platform: Docker for containerization and deployment on edge devices or cloud platforms
    • Monitoring and Alert System: Custom-built using Flask/Django (for web interface) and integrated with real-time notification services (e.g., Slack, email, SMS)

    This project will significantly contribute to improving the safety standards in mining operations by ensuring that all workers are consistently wearing the required protective gear.

  14. TinyML Market Analysis, Size, and Forecast 2025-2029: North America (US and...

    • technavio.com
    pdf
    Updated Jul 9, 2025
    Cite
    Technavio (2025). TinyML Market Analysis, Size, and Forecast 2025-2029: North America (US and Canada), Europe (France, Germany, and UK), APAC (China, India, Japan, and South Korea), South America (Brazil), and Rest of World (ROW) [Dataset]. https://www.technavio.com/report/tinyml-market-industry-analysis
    Available download formats: pdf
    Dataset updated
    Jul 9, 2025
    Dataset provided by
    TechNavio
    Authors
    Technavio
    Time period covered
    2025 - 2029
    Area covered
    Canada, Germany, United Kingdom, United States
    Description


    TinyML Market Size 2025-2029

    The TinyML market size is forecast to increase by USD 5.67 billion at a CAGR of 34% between 2024 and 2029.
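    The two headline figures can be related through basic compound-growth arithmetic. The sketch below assumes the USD 5.67 billion increase accrues over the five years from 2024 to 2029 at a constant 34% annual rate. The function names are hypothetical, and the implied base-year size is a derived illustration, not a figure stated in the report.

```python
def growth_multiple(cagr, years):
    """Total growth factor after compounding `cagr` for `years` years."""
    return (1.0 + cagr) ** years


def implied_base_size(increase, cagr, years):
    """Base-year size implied by an absolute increase and a constant CAGR.

    Solves increase = base * ((1 + cagr) ** years - 1) for base.
    """
    return increase / (growth_multiple(cagr, years) - 1.0)


# Assumed inputs taken from the summary: USD 5.67 billion, 34%, 5 years.
base_2024 = implied_base_size(5.67, 0.34, 5)
end_2029 = base_2024 + 5.67
print(f"implied 2024 size: USD {base_2024:.2f} billion")
print(f"implied 2029 size: USD {end_2029:.2f} billion")
```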

    The market is experiencing significant growth, driven by the increasing proliferation of Internet of Things (IoT) devices and the imperative for edge intelligence. The need for real-time processing and decision-making at the edge is becoming increasingly crucial for IoT applications, leading to a rise in demand for lightweight machine learning models like TinyML. These applications require complex computations to be performed at the edge, making TinyML an attractive solution due to its small size and low power consumption. However, the market faces challenges in overcoming technical complexity and the scarcity of specialized talent.
    Moreover, the emergence of vision and advanced sensor fusion as key applications for TinyML is further fueling market growth. Developing and deploying TinyML models requires a deep understanding of machine learning algorithms and hardware constraints, making it a challenging task for many organizations. Additionally, the lack of skilled professionals in this area can hinder market growth. Companies seeking to capitalize on the opportunities presented by the market must invest in building the necessary expertise or partnering with specialized firms to overcome these challenges effectively.
    

    What will be the Size of the TinyML Market during the forecast period?

    Explore in-depth regional segment analysis with market size data - historical 2019-2023 and forecasts 2025-2029 - in the full report.

    In the realm of Machine Learning (ML), TinyML is gaining traction as a significant market segment, particularly in industrial automation and IoT applications. Performance benchmarking is crucial in this domain, focusing on inference speed, model deployment, and memory footprint. Industrial automation and medical devices are prime areas of application, where data compression and real-world applications are essential. Hardware platforms and microcontroller programming play a pivotal role in TinyML's success. Data security, data augmentation, and regularization techniques are vital for enhancing model accuracy and reliability. Testing methodologies, activation functions, and hyperparameter tuning are essential development tools for creating efficient ML models.

    Ethical implications and privacy considerations are increasingly important in this market, with power consumption being a key concern. Smart agriculture and wearable technology are emerging sectors where TinyML's low power consumption and real-time inference capabilities offer significant advantages. Loss functions and gradient descent are essential techniques for model evaluation and optimization. Development tools and software libraries facilitate the creation of efficient ML models, enabling faster time-to-market and improved performance.

    How is this TinyML Industry segmented?

    The TinyML industry research report provides comprehensive data (region-wise segment analysis), with forecasts and estimates in 'USD million' for the period 2025-2029, as well as historical data from 2019-2023 for the following segments.

    Component
      Solutions
      Services

    Application
      Process optimization
      Health monitoring
      Smart agriculture
      Environmental monitoring
      Autonomous vehicles

    End-user
      Healthcare
      Manufacturing
      Consumer electronics
      Agriculture
      Others

    Geography
      North America
        US
        Canada
      Europe
        France
        Germany
        UK
      APAC
        China
        India
        Japan
        South Korea
      South America
        Brazil
      Rest of World (ROW)

    By Component Insights

    The Solutions segment is estimated to witness significant growth during the forecast period. The market is witnessing significant advancements in energy-efficient algorithms and neural network pruning, enabling on-device training of classification algorithms for IoT integration. Real-time processing and accuracy metrics are crucial for efficient deep learning, driving the adoption of microcontroller optimization and latency optimization techniques. Edge computing and signal processing are essential for performance evaluation and sensor data analytics, leading to the development of binary neural networks and time series analysis methods. Feature extraction and power constraints are key considerations for deployment strategies, which include sensor fusion and embedded machine learning.

    Pattern recognition, model quantization, and ultra-low power solutions cater to resource-constrained devices, while on-device inference and low-power sensors optimize memory and hardware acceleration. Anomaly detection and regression models are integral to TinyML frameworks, enabling on-chip processing and real-time response. The market is characterized by intense competition, with hardware and software innovations prop

  15. Results on di-peptide composition (DPC) data.

    • plos.figshare.com
    • datasetcatalog.nlm.nih.gov
    xls
    Updated Jun 18, 2025
    Cite
    Waqar Ahmad; Abdul Raheem Shahzad; Muhammad Awais Amin; Waqas Haider Bangyal; Tahani Jaser Alahmadi; Saddam Hussain Khan (2025). Results on di-peptide composition (DPC) data. [Dataset]. http://doi.org/10.1371/journal.pone.0321761.t005
    Available download formats: xls
    Dataset updated
    Jun 18, 2025
    Dataset provided by
    PLOS ONE
    Authors
    Waqar Ahmad; Abdul Raheem Shahzad; Muhammad Awais Amin; Waqas Haider Bangyal; Tahani Jaser Alahmadi; Saddam Hussain Khan
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    The prevalence of Leukaemia, a malignant blood cancer that originates from hematopoietic progenitor cells, is increasing in Southeast Asia, with a worrisome fatality rate of 54%. Predicting outcomes in the early stages is vital for improving the chances of patient recovery. The aim of this research is to enhance early-stage prediction systems in a substantial manner. Using Machine Learning and Data Science, we exploit protein sequential data from commonly altered genes including BCL2, HSP90, PARP, and RB to make predictions for Chronic Myeloid Leukaemia (CML). The methodology we implement is based on the utilisation of reliable methods for extracting features, namely Di-peptide Composition (DPC), Amino Acid Composition (AAC), and Pseudo amino acid composition (Pse-AAC). We also take into consideration the identification and handling of outliers, as well as the validation of feature selection using the Pearson Correlation Coefficient (PCA). Data augmentation guarantees a comprehensive dataset for analysis. By utilising several Machine Learning models such as Support Vector Machine (SVM), XGBoost, Random Forest (RF), K Nearest Neighbour (KNN), Decision Tree (DT), and Logistic Regression (LR), we have achieved accuracy rates ranging from 66% to 94%. These classifiers are thoroughly evaluated utilising performance criteria such as accuracy, sensitivity, specificity, F1-score, and the confusion matrix.

    The solution we suggest is a user-friendly online application dashboard that can be used for early detection of CML. This tool has significant implications for practitioners and may be used in healthcare institutions and hospitals.

  16. Raw data.

    • plos.figshare.com
    xlsx
    Updated Jun 24, 2025
    Cite
    Soheil Mohammadi; Ali Jahanshahi; Mohammad Shahrabi Farahani; Mohammad Amin Salehi; Negin Frounchi; Ali Guermazi (2025). Raw data. [Dataset]. http://doi.org/10.1371/journal.pone.0326339.s015
    Available download formats: xlsx
    Dataset updated
    Jun 24, 2025
    Dataset provided by
    PLOS ONE
    Authors
    Soheil Mohammadi; Ali Jahanshahi; Mohammad Shahrabi Farahani; Mohammad Amin Salehi; Negin Frounchi; Ali Guermazi
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    Aim of the study

    The aim was to systematically review the literature and perform a meta-analysis to estimate the performance of artificial intelligence (AI) algorithms in detecting meniscal injuries.

    Materials and methods

    A systematic search was performed in the Scopus, PubMed, EBSCO, Cinahl, Web of Science, IEEE Xplore, and Cochrane Central databases in July 2024. The included studies' reporting quality and risk of bias were evaluated using the Transparent Reporting of a multivariable prediction model for Individual Prognosis or Diagnosis (TRIPOD) and the Prediction Model Study Risk of Bias Assessment Tool (PROBAST), respectively. A meta-analysis was also done using contingency tables to estimate diagnostic performance metrics (sensitivity and specificity), and a meta-regression analysis was performed to investigate the effect of the following variables on the main outcome: imaging view, data augmentation and transfer learning usage, and presence of meniscal tear in the injury, with a corresponding 95% confidence interval (CI) and a P-value of 0.05 as a threshold for significance.

    Results

    Among 28 included studies, 92 contingency tables were extracted from 15 studies. The reference standards of the studies were mostly expert radiologists, orthopedists, or surgical reports. The pooled sensitivity and specificity for AI algorithms on internal validation were 81% (95% CI: 78, 85) and 78% (95% CI: 72, 83), and for clinicians on internal validation were 85% (95% CI: 76, 91) and 88% (95% CI: 83, 92), respectively. The pooled sensitivity and specificity for studies validating algorithms with an external test set were 82% (95% CI: 74, 88) and 88% (95% CI: 84, 91), respectively.

    Conclusion

    The results of this study suggest that AI-based algorithms have lower diagnostic performance than clinicians in detecting knee meniscal injuries.
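    For readers unfamiliar with the metrics, sensitivity and specificity for a single study follow directly from its 2x2 contingency table. The sketch below is illustrative only: the class name, function names, and counts are hypothetical and are not taken from the included studies, and it does not reproduce the pooling step of the meta-analysis, which combines many such tables statistically.

```python
from dataclasses import dataclass


@dataclass
class ContingencyTable:
    """Counts of a diagnostic test against a reference standard."""
    tp: int  # injured, flagged by the algorithm
    fp: int  # not injured, flagged by the algorithm
    fn: int  # injured, missed by the algorithm
    tn: int  # not injured, correctly cleared


def sensitivity(table):
    """Fraction of truly injured cases that were detected."""
    return table.tp / (table.tp + table.fn)


def specificity(table):
    """Fraction of uninjured cases that were correctly cleared."""
    return table.tn / (table.tn + table.fp)


# Made-up counts for illustration only.
example = ContingencyTable(tp=81, fp=22, fn=19, tn=78)
print(f"sensitivity = {sensitivity(example):.2f}")
print(f"specificity = {specificity(example):.2f}")
```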

  17. Study characteristics, external validation.

    • plos.figshare.com
    xls
    Updated Jun 24, 2025
    Cite
    Soheil Mohammadi; Ali Jahanshahi; Mohammad Shahrabi Farahani; Mohammad Amin Salehi; Negin Frounchi; Ali Guermazi (2025). Study characteristics, external validation. [Dataset]. http://doi.org/10.1371/journal.pone.0326339.t002
    Available download formats: xls
    Dataset updated
    Jun 24, 2025
    Dataset provided by
    PLOS ONE
    Authors
    Soheil Mohammadi; Ali Jahanshahi; Mohammad Shahrabi Farahani; Mohammad Amin Salehi; Negin Frounchi; Ali Guermazi
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    Aim of the study

    The aim was to systematically review the literature and perform a meta-analysis to estimate the performance of artificial intelligence (AI) algorithms in detecting meniscal injuries.

    Materials and methods

    A systematic search was performed in the Scopus, PubMed, EBSCO, Cinahl, Web of Science, IEEE Xplore, and Cochrane Central databases in July 2024. The included studies' reporting quality and risk of bias were evaluated using the Transparent Reporting of a multivariable prediction model for Individual Prognosis or Diagnosis (TRIPOD) and the Prediction Model Study Risk of Bias Assessment Tool (PROBAST), respectively. A meta-analysis was also done using contingency tables to estimate diagnostic performance metrics (sensitivity and specificity), and a meta-regression analysis was performed to investigate the effect of the following variables on the main outcome: imaging view, data augmentation and transfer learning usage, and presence of meniscal tear in the injury, with a corresponding 95% confidence interval (CI) and a P-value of 0.05 as a threshold for significance.

    Results

    Among 28 included studies, 92 contingency tables were extracted from 15 studies. The reference standards of the studies were mostly expert radiologists, orthopedists, or surgical reports. The pooled sensitivity and specificity for AI algorithms on internal validation were 81% (95% CI: 78, 85) and 78% (95% CI: 72, 83), and for clinicians on internal validation were 85% (95% CI: 76, 91) and 88% (95% CI: 83, 92), respectively. The pooled sensitivity and specificity for studies validating algorithms with an external test set were 82% (95% CI: 74, 88) and 88% (95% CI: 84, 91), respectively.

    Conclusion

    The results of this study suggest that AI-based algorithms have lower diagnostic performance than clinicians in detecting knee meniscal injuries.

  18. Data from: Glucocorticoid augmentation of prolonged exposure therapy:...

    • tandf.figshare.com
    pdf
    Updated May 30, 2023
    Cite
    Rachel Yehuda; Linda M. Bierer; Laura Pratchett; Monica Malowney (2023). Glucocorticoid augmentation of prolonged exposure therapy: rationale and case report [Dataset]. http://doi.org/10.6084/m9.figshare.21776407.v1
    Available download formats: pdf
    Dataset updated
    May 30, 2023
    Dataset provided by
    Taylor & Francis
    Authors
    Rachel Yehuda; Linda M. Bierer; Laura Pratchett; Monica Malowney
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    Prolonged exposure (PE) therapy has been found to reduce symptoms of posttraumatic stress disorder (PTSD); however, it is difficult for many patients to engage fully in the obligatory retelling of their traumatic experiences. This problem is compounded by the fact that habituation and cognitive restructuring, the main mechanisms through which PE is hypothesized to work, are not instantaneous processes, and often require several weeks before the distress associated with imaginal exposure abates.

    Two cases are described that respectively illustrate the use of hydrocortisone and placebo, in combination with PE, for the treatment of combat-related PTSD. Based on known effects of glucocorticoids on learning and memory performance, we hypothesized that augmentation with hydrocortisone would improve the therapeutic effects of PE by hastening "new" learning and facilitating decreases in the emotional impact of fear memories during the course of treatment. The veteran receiving hydrocortisone augmentation of PE displayed an accelerated and ultimately greater decline in PTSD symptoms than the veteran receiving placebo.

    While no general conclusion can be derived from comparison of two patients, the findings are consistent with the rationale for augmentation. These case reports support the potential for an appropriately designed and powered clinical trial to examine the efficacy of glucocorticoids in augmenting the effects of psychotherapy for PTSD. For the abstract or full text in other languages, please see Supplementary files (under Reading Tools online).
