6 datasets found
  1. rsna-mammography-768-vl-perlabel
     Ready-to-use PNG files for Keras and PyTorch data loaders using label inference

    • kaggle.com
    zip
    Updated Feb 12, 2023
    Cite
    Yacine Bouaouni (2023). rsna-mammography-768-vl-perlabel [Dataset]. https://www.kaggle.com/datasets/jarvisai7/rsna-mammography-768-vl-perlabel
    Available download formats: zip (8268291887 bytes)
    Dataset updated
    Feb 12, 2023
    Authors
    Yacine Bouaouni
    Description

    This dataset contains images in PNG format for the RSNA Screening Mammography Breast Cancer Detection competition. All images are 768×768 pixels and are organized into two folders (one per label), which allows inferring the labels from the directory names (helpful, for example, with Keras image_dataset_from_directory).

    • The DICOM files were processed to PNGs in this notebook by Radek Osmulski: https://www.kaggle.com/code/radek1/how-to-process-dicom-images-to-pngs?scriptVersionId=113529850
    • The contribution of this dataset is to organize the files in a way that enables inferring labels from the directory names.

    🎉 If this dataset helps you in your work, don't hesitate to upvote! 🎉
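
    A minimal loading sketch with Keras; the input path is an assumption based on the dataset slug:

    import tensorflow as tf

    # Labels are inferred from the two per-label folders.
    train_ds = tf.keras.utils.image_dataset_from_directory(
        "/kaggle/input/rsna-mammography-768-vl-perlabel",  # assumed mount path
        labels="inferred",
        label_mode="binary",
        image_size=(768, 768),
        batch_size=32,
    )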

  2. Multi-Source Plant Disease (MSPD) Dataset

    • kaggle.com
    zip
    Updated Nov 3, 2025
    Cite
    synyyy (2025). Multi-Source Plant Disease (MSPD) Dataset [Dataset]. https://www.kaggle.com/datasets/synyyy/plantdataset
    Available download formats: zip (3436366888 bytes)
    Dataset updated
    Nov 3, 2025
    Authors
    synyyy
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Description

    Overview

    This dataset is a cleaned, standardized, and comprehensive collection of plant leaf images designed for training high-accuracy classification models. It addresses a common challenge in agricultural computer vision by merging four popular but distinctly formatted datasets (PlantVillage, PlantDoc, PlantWild, and PlantSeg).

    The primary goal is to provide a clean and robust dataset. All images are organized as crop_name/disease_name/image.jpg, and all directory names have been standardized to snake_case. Furthermore, ambiguous or duplicate class names (e.g., Apple scab and Scab) have been merged into single, unified categories.

    This makes the dataset directly compatible with modern deep learning frameworks like PyTorch and TensorFlow.

    Directory Structure

    The data is organized in a hierarchical format perfect for use with ImageFolder-style data loaders. All directories have been standardized to lowercase snake_case.

    /
    ├── apple/
    │   ├── scab/
    │   │   ├── image1.jpg
    │   │   └── ...
    │   ├── black_rot/
    │   └── ...
    ├── tomato/
    │   ├── bacterial_spot/
    │   └── ...
    ├── corn/
    │   └── ...
    └── ...

    • Top-level: crop_name (e.g., apple, tomato, corn)
    • First sub-level: disease_name (e.g., scab, healthy, leaf_mold)

    This structure allows for easy training of both large, multi-crop models and specialized, crop-specific submodels.
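
    For example, a crop-specific loader can point torchvision's ImageFolder at a single crop directory (the root path below is an assumption):

    from torchvision import datasets, transforms

    tfm = transforms.Compose([transforms.Resize((224, 224)), transforms.ToTensor()])
    apple_ds = datasets.ImageFolder("plant_dataset/apple", transform=tfm)  # assumed path
    print(apple_ds.classes)  # disease folders, e.g. ['black_rot', 'healthy', 'scab']

    Pointing ImageFolder at the dataset root instead yields crop-level labels, since the immediate subdirectories are treated as classes.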

    Source Datasets & Loading

    This dataset was created by merging the following public sources:

    • PlantVillage Dataset
    • PlantDoc Dataset
    • PlantWild Dataset
    • PlantSeg Dataset

    A comprehensive cleaning process was applied to merge duplicate/synonymous disease folders and standardize all folder names.

    The associated "starter notebook" provides the essential MultiCropDiseaseDataset class required to easily load this complex, multi-crop structure. This class correctly parses the folders and returns three items for each image (image, crop_label, and disease_label) for direct use in a PyTorch DataLoader.

    Note on the other_crops Folder

    You will find a folder named other_crops. This folder contains a wide variety of other plants (e.g., bean_halo_blight, rice_blast) from the source datasets. The MultiCropDiseaseDataset class in the starter notebook will load this folder along with all other crops, treating other_crops as its own distinct category.
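
    The real loader ships with the starter notebook; for orientation, a hypothetical re-implementation returning (image, crop_label, disease_label) might look like this:

    from pathlib import Path
    from PIL import Image
    from torch.utils.data import Dataset

    class MultiCropDiseaseSketch(Dataset):  # hypothetical stand-in, not the official class
        def __init__(self, root, transform=None):
            self.samples = sorted(Path(root).glob("*/*/*.jpg"))
            self.crops = sorted({p.parent.parent.name for p in self.samples})
            self.diseases = sorted({p.parent.name for p in self.samples})
            self.transform = transform

        def __len__(self):
            return len(self.samples)

        def __getitem__(self, idx):
            path = self.samples[idx]
            img = Image.open(path).convert("RGB")
            if self.transform:
                img = self.transform(img)
            # other_crops is picked up like any other top-level crop folder.
            return (img,
                    self.crops.index(path.parent.parent.name),
                    self.diseases.index(path.parent.name))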

  3. Data from: Aircraft Marshaling Signals Dataset of FMCW Radar and Event-Based Camera for Sensor Fusion

    • data.niaid.nih.gov
    Updated Dec 11, 2023
    + more versions
    Cite
    Leon Müller; Manolis Sifalakis; Sherif Eissa; Amirreza Yousefzadeh; Sander Stuijk; Federico Corradi; Paul Detterer (2023). Aircraft Marshaling Signals Dataset of FMCW Radar and Event-Based Camera for Sensor Fusion [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_7656910
    Dataset updated
    Dec 11, 2023
    Dataset provided by
    IMEC
    Eindhoven University of Technology
    Authors
    Leon Müller; Manolis Sifalakis; Sherif Eissa; Amirreza Yousefzadeh; Sander Stuijk; Federico Corradi; Paul Detterer
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Dataset Introduction

    The advent of neural networks capable of learning salient features from variance in radar data has expanded the breadth of radar applications, often as an alternative sensor or a complementary modality to camera vision. Gesture recognition for command control is arguably the most commonly explored application. Nevertheless, better benchmarking datasets than those currently available are needed to assess and compare the merits of the different proposed solutions, and to explore a broader range of scenarios than simple hand gesturing a few centimeters away from a radar transmitter/receiver. Most currently public radar datasets used in gesture recognition provide limited diversity, do not provide access to raw ADC data, and are not significantly challenging. To address these shortcomings, we created and make available a new dataset that combines FMCW radar and a dynamic vision camera for 10 aircraft marshalling signals (whole body), recorded from 13 people at several distances and angles from the sensors. The two modalities are hardware-synchronized using the radar's PRI signal. Moreover, in the supporting publication we propose a sparse encoding of the time-domain (ADC) signals that achieves a dramatic data-rate reduction (>76%) while retaining the efficacy of the downstream FFT processing (<2% accuracy loss on recognition tasks), and can be used to create a sparse event-based representation of the radar data. In this way the dataset can be used as a two-modality neuromorphic dataset.

    Synchronization of the two modalities

    The PRI pulses from the radar have been hard-wired to the event stream of the DVS sensor and timestamped using the DVS clock. Based on this signal, the DVS event stream has been segmented such that groups of events (time bins) of the DVS are mapped to individual radar pulses (chirps).

    Data storage

    DVS events (x, y coordinates and timestamps) are stored in structured arrays, and one such structured-array object is associated with the data of one radar transmission (pulse/chirp). A radar transmission is a vector of 512 ADC levels that correspond to sampling points of the chirping signal (FMCW radar), which lasts about 1.3 ms. Every 192 radar transmissions are stacked in a matrix called a radar frame (each transmission is a row in that matrix). A data capture (recording), consisting of some thousands of continuous radar transmissions, is therefore segmented into a number of radar frames. Finally, radar frames and the corresponding DVS structured arrays are stored in separate containers in a custom multi-container file format (extension .rad). We provide a .rad file parser for extracting the data out of these files. There is one file per capture of continuous gesture recording of about 10 s.

    Note that 192 transmissions per radar frame is an ad-hoc segmentation that yields sufficient signal resolution in the 2D FFT typical of radar signal processing, given the range resolution of this specific radar. It also served the purpose of fast streaming storage of the data during capture. For extracting individual data points from the dataset, however, one can pool together (concatenate) all the radar frames from a single capture file and re-segment them as desired. The data loader that we provide offers this, with a default of re-segmenting every 769 transmissions (about 1 s of gesturing).
    Data captures directory organization (radar8Ghz-DVS-marshaling_signals_20220901_publication_anonymized.7z)

    The dataset captures (recordings) are organized in a common directory structure that encodes additional metadata about the captures:

    dataset_dir/<stage>/<room>/<subject>-<gesture>-<distance>/ofxRadar8Ghz_yyyy-mm-dd_HH-MM-SS.rad

    Identifiers

    • stage: [train, test]
    • room: [conference_room, foyer, open_space]
    • subject: [0-9]. Note that 0 stands for no person, and 1 for an unlabeled, random person (only present in test).
    • gesture: ['none', 'emergency_stop', 'move_ahead', 'move_back_v1', 'move_back_v2', 'slow_down', 'start_engines', 'stop_engines', 'straight_ahead', 'turn_left', 'turn_right']
    • distance: ['xxx', '100', '150', '200', '250', '300', '350', '400', '450']. Note that xxx is used for none gestures when there is no person in front of the radar (i.e. background samples), or when a person is walking in front of the radar at varying distances but performing no gesture.

    The test data captures contain both subjects that appear in the train data and previously unseen subjects. Similarly, the test data contain captures from the spaces where the train data were recorded, as well as from a new, unseen open space.

    Files List

    • radar8Ghz-DVS-marshaling_signals_20220901_publication_anonymized.7z: the actual archive bundle with the data captures (recordings).
    • rad_file_parser_2.py: parser for individual .rad files, which contain the capture data.
    • loader.py: a convenience PyTorch Dataset loader (partly Tonic compatible). You practically only need this to quick-start if you don't want to delve too much into code reading. When you instantiate a DvsRadarAircraftMarshallingSignals object, it automatically downloads the dataset archive and the .rad file parser, unpacks the archive, and imports the parser to load the data. One can then request from it a training set, a validation set, and a test set as torch Datasets to work with.
    • aircraft_marshalling_signals_howto.ipynb: Jupyter notebook demonstrating basic use of loader.py.

    Contact

    For further information or questions, contact M. Sifalakis or F. Corradi.
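
    A hedged usage sketch based on the loader description above; the accessor method names are assumptions, so check loader.py for the actual API:

    from loader import DvsRadarAircraftMarshallingSignals

    # Instantiation downloads the archive and the .rad parser, then unpacks them.
    data = DvsRadarAircraftMarshallingSignals()
    # Assumed accessors returning torch Datasets (the loader re-segments every
    # 769 transmissions by default, i.e. about 1 s of gesturing):
    train_set = data.get_train_set()
    val_set = data.get_validation_set()
    test_set = data.get_test_set()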

  4. timm-0.6.7-py3

    • kaggle.com
    zip
    Updated Aug 11, 2022
    Cite
    Aleksandr Snorkin (2022). timm-0.6.7-py3 [Dataset]. https://www.kaggle.com/datasets/parapapapam/timm067py3
    Available download formats: zip (495953 bytes)
    Dataset updated
    Aug 11, 2022
    Authors
    Aleksandr Snorkin
    Description

    Pytorch Image Models (timm)

    timm is a deep-learning library created by Ross Wightman: a collection of SOTA computer-vision models, layers, utilities, optimizers, schedulers, data loaders, augmentations, and training/validation scripts, with the ability to reproduce ImageNet training results.

    https://github.com/rwightman/pytorch-image-models

    # Installation
    !python -m pip install /kaggle/input/timm067py3/timm-0.6.7-py3-none-any.whl
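
    Once installed, models are created the usual way; a minimal example:

    import timm

    # Any name from timm.list_models() works; resnet50 is just an example.
    model = timm.create_model("resnet50", pretrained=False, num_classes=2)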
    
  5. Dataset

    • kaggle.com
    zip
    Updated Nov 10, 2025
    Cite
    Daniel Fowler (2025). Dataset [Dataset]. https://www.kaggle.com/datasets/vacantdaniel/dataset
    Available download formats: zip (41539094 bytes)
    Dataset updated
    Nov 10, 2025
    Authors
    Daniel Fowler
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    ECG Image Digitization - Deep Learning Solution

    Overview

    This dataset contains a complete deep learning pipeline for digitizing ECG images into time series data for the PhysioNet ECG Image Digitization Challenge. The solution extracts 12-lead ECG signals from degraded images including scans, photos, and physically damaged printouts.

    Model Architecture

    Encoder-Decoder Design

    • Encoder: ResNet-inspired CNN with residual connections

      • Progressive channel expansion: 64 → 128 → 256 → 512
      • Extracts robust spatial features from preprocessed ECG images
      • Handles rotation, noise, and physical artifacts
    • Decoder: Bidirectional LSTM with attention

      • 3-layer BiLSTM with 512 hidden units per direction
      • 12 independent prediction heads for lead-specific reconstruction
      • Dynamic sequence length: 10s for Lead II, 2.5s for other leads
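
    For orientation, a minimal structural sketch in PyTorch of an encoder-decoder with these dimensions (residual connections and the attention mechanism are omitted for brevity; this is not the dataset's model.py):

    import torch
    import torch.nn as nn

    class ECGDigitizerSketch(nn.Module):
        def __init__(self, n_leads=12, hidden=512):
            super().__init__()
            # Encoder: progressive channel expansion 64 -> 128 -> 256 -> 512.
            chans = [3, 64, 128, 256, 512]
            blocks = []
            for cin, cout in zip(chans, chans[1:]):
                blocks += [nn.Conv2d(cin, cout, 3, stride=2, padding=1),
                           nn.BatchNorm2d(cout),
                           nn.ReLU(inplace=True)]
            self.encoder = nn.Sequential(*blocks)
            # Decoder: 3-layer BiLSTM, 512 hidden units per direction.
            self.decoder = nn.LSTM(512, hidden, num_layers=3,
                                   bidirectional=True, batch_first=True)
            # 12 independent per-lead prediction heads.
            self.heads = nn.ModuleList(nn.Linear(2 * hidden, 1) for _ in range(n_leads))

        def forward(self, x):                          # x: (B, 3, 512, 1024)
            feats = self.encoder(x)                    # (B, 512, 32, 64)
            seq = feats.mean(dim=2).permute(0, 2, 1)   # pool height -> (B, 64, 512)
            out, _ = self.decoder(seq)                 # (B, 64, 2 * hidden)
            per_lead = [h(out).squeeze(-1) for h in self.heads]
            return torch.stack(per_lead, dim=1)        # (B, 12, T)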

    Preprocessing Pipeline

    The solution includes comprehensive image enhancement:

    1. Denoising: Non-local means filtering removes scanning artifacts
    2. Rotation Correction: Hough transform-based automatic alignment
    3. Contrast Enhancement: CLAHE (Contrast Limited Adaptive Histogram Equalization)
    4. Normalization: ImageNet-style standardization (mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
    5. Resizing: Standardized to 512×1024 pixels for consistent processing
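
    A hedged sketch of steps 1 and 3-5 with OpenCV (rotation correction omitted for brevity; parameter values are illustrative assumptions):

    import cv2
    import numpy as np

    def preprocess(path):
        img = cv2.imread(path)
        img = cv2.fastNlMeansDenoisingColored(img, None, 10, 10, 7, 21)   # 1. denoise
        lab = cv2.cvtColor(img, cv2.COLOR_BGR2LAB)                        # 3. CLAHE on L channel
        l, a, b = cv2.split(lab)
        l = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8)).apply(l)
        img = cv2.cvtColor(cv2.merge([l, a, b]), cv2.COLOR_LAB2BGR)
        img = cv2.resize(img, (1024, 512))                                # 5. resize to W x H
        x = img[..., ::-1].astype(np.float32) / 255.0                     # BGR -> RGB
        mean = np.array([0.485, 0.456, 0.406], dtype=np.float32)
        std = np.array([0.229, 0.224, 0.225], dtype=np.float32)
        return (x - mean) / std                                           # 4. normalize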

    Training Strategy

    Optimization

    • Loss Function: Mean Squared Error (MSE) between predicted and ground truth signals
    • Optimizer: AdamW with weight decay (1e-5) for regularization
    • Learning Rate: 1e-3 with ReduceLROnPlateau scheduler (patience=5, factor=0.5)
    • Batch Size: 4 (optimized for memory efficiency)
    • Epochs: 50 with early stopping based on validation SNR
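
    The stated configuration translates directly into PyTorch; a minimal sketch with a placeholder model:

    import torch

    model = torch.nn.Linear(8, 1)  # placeholder standing in for the ECG network

    criterion = torch.nn.MSELoss()
    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3, weight_decay=1e-5)
    # mode="max": the schedule tracks validation SNR, where higher is better.
    scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
        optimizer, mode="max", factor=0.5, patience=5)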

    Data Augmentation

    Training robustness is achieved through realistic augmentations (see the sketch below):

    • Gaussian noise injection (simulates scanning noise)
    • Motion and Gaussian blur (simulates camera shake)
    • Grid distortion (simulates paper warping)
    • Random brightness/contrast adjustments
    • Elastic transformations (simulates physical deformation)
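
    A hedged sketch of this augmentation stack using albumentations (the library choice and probabilities are assumptions):

    import albumentations as A

    augment = A.Compose([
        A.GaussNoise(p=0.3),                                  # scanning noise
        A.OneOf([A.MotionBlur(), A.GaussianBlur()], p=0.3),   # camera shake
        A.GridDistortion(p=0.3),                              # paper warping
        A.RandomBrightnessContrast(p=0.3),
        A.ElasticTransform(p=0.3),                            # physical deformation
    ])
    # Usage: augmented = augment(image=image)["image"]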

    Validation Strategy

    • 90/10 train-validation split
    • Modified SNR metric for evaluation
    • Time alignment: ±0.2 second tolerance
    • Vertical alignment: Automatic offset correction
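
    A minimal sketch of a plain SNR computation with vertical-offset correction; the challenge's modified variant (time alignment within ±0.2 s, etc.) is more involved:

    import numpy as np

    def snr_db(ref, est):
        est = est - (est.mean() - ref.mean())   # automatic vertical-offset correction
        noise = ref - est
        return 10 * np.log10(np.sum(ref ** 2) / np.sum(noise ** 2))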

    Advanced Features

    Ensemble Methods

    • Weighted averaging of multiple models
    • Stacking ensemble support for improved accuracy
    • Configurable ensemble weights

    Post-Processing

    • Wavelet denoising for signal smoothing
    • Savitzky-Golay filtering
    • Baseline drift correction
    • ECG lead relationship constraints (Einthoven's and Goldberger's laws)
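
    A hedged sketch of the two smoothing steps with PyWavelets and SciPy; wavelet, window, and threshold values are illustrative assumptions:

    import numpy as np
    import pywt
    from scipy.signal import savgol_filter

    def smooth(sig):
        coeffs = pywt.wavedec(sig, "db4", level=4)                        # wavelet denoise
        coeffs[1:] = [pywt.threshold(c, np.std(c), "soft") for c in coeffs[1:]]
        sig = pywt.waverec(coeffs, "db4")[: len(sig)]
        return savgol_filter(sig, window_length=15, polyorder=3)          # Savitzky-Golay
    # Lead constraints (e.g. Einthoven: II = I + III) would be enforced separately.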

    Transfer Learning

    • Pre-trained ResNet and EfficientNet backbones from ImageNet
    • Fine-tuning on ECG-specific features
    • Faster convergence and better generalization

    Cross-Validation

    • K-fold validation framework (default k=5)
    • Stratified splits for robust evaluation
    • Performance metrics aggregated across folds

    Key Capabilities

    Artifact Handling

    • Rotation Errors: Automatic detection and correction up to ±45 degrees
    • Scanning Noise: Robust denoising without signal degradation
    • Physical Damage: Resilient to stains, tears, and occlusions
    • Variable Quality: Handles both high-quality scans and low-quality photos

    Multi-Lead Consistency

    • Independent prediction heads maintain lead-specific characteristics
    • Enforces physiological relationships between leads
    • Consistent temporal alignment across all leads

    Signal Quality Metrics

    • Modified SNR calculation with automatic alignment
    • Lead-specific performance tracking
    • Comprehensive validation reporting

    Dataset Contents

    Core Implementation Files

    • model.py: Neural network architecture (ResNet encoder + LSTM decoder)
    • preprocessing.py: Image preprocessing and augmentation pipeline
    • dataset.py: PyTorch data loaders for training and inference
    • metrics.py: Modified SNR evaluation metric
    • train.py: Complete training loop with checkpointing
    • inference.py: Batch prediction and submission generation
    • app.py: Streamlit interactive visualization interface

    Advanced Features

    • ensemble.py: Multi-model ensemble methods
    • postprocessing.py: Signal refinement and constraint enforcement
    • transfer_learning.py: Pre-trained backbone integration
    • cross_validation.py: K-fold validation framework
    • hyperparameter_tuning.py: Automated grid search

    Utilities

    • config.py: Centralized configuration management
    • utils.py: Helper functions for data handling
    • kaggle_submission_standalone.py: Self-contained Kaggle notebook

    Performance Characteristics

    Training Performance

    • Training Time: 2-4 hours on GPU (50 epochs, full dataset)
    • Memory Usage: ~6GB GPU memory (batch size 4)
    • Convergence: Typically reaches optimal SNR within 30-40 epochs

    Inference Performance

    • Speed: ~1-2 sec...
  6. Immobilized fluorescently stained zebrafish through the eXtended Field of view Light Field Microscope 2D-3D dataset

    • data.niaid.nih.gov
    Updated Jul 11, 2024
    Cite
    Page Vizcaíno, Josué; Symvoulidis, Panagiotis; Wang, Zeguan; Jelten, Jonas; Favaro, Paolo; Boyden, Edward S.; Lasser, Tobias (2024). Immobilized fluorescently stained zebrafish through the eXtended Field of view Light Field Microscope 2D-3D dataset [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_8024695
    Dataset updated
    Jul 11, 2024
    Dataset provided by
    Synthetic Neurobiology Group, Massachusetts Institute of Technology, USA
    Computer Vision Group, University of Bern, Switzerland
    Computational Imaging and Inverse Problems, Department of Informatics, School of Computation, Information and Technology, Technical University of Munich, Germany 2Munich Institute of Biomedical Engineering, Technical University of Munich, Germany
    Authors
    Page Vizcaíno, Josué; Symvoulidis, Panagiotis; Wang, Zeguan; Jelten, Jonas; Favaro, Paolo; Boyden, Edward S.; Lasser, Tobias
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Immobilized fluorescently stained zebrafish through the eXtended Field of view Light Field Microscope 2D-3D dataset

    This dataset comprises three immobilized fluorescently stained zebrafish imaged through the eXtended Field of view Light Field Microscope (XLFM, also known as Fourier Light Field Microscope). The images were preprocessed with the SLNet, which extracts the sparse signals from the images (a.k.a. the neural activity).

    If you intend to use this with PyTorch, you can find a data loader and working source code to load and train networks here.

    This dataset is part of the publication: Fast light-field 3D microscopy with out-of-distribution detection and adaptation through Conditional Normalizing Flows.

    The fish present are:

    1x NLS GCaMP6s

    1x Pan-neuronal nuclear localized GCaMP6s Tg(HuC:H2B:GCaMP6s)

    1x Soma localized GCaMP7f Tg(HuC:somaGCaMP7f)

    The dataset is structured as follows:

    XLFM_dataset/
    ├── Dataset/
    │   ├── GCaMP6s_NLS_1/
    │   │   ├── SLNet_preprocessed/
    │   │   │   ├── XLFM_image/
    │   │   │   │   └── XLFM_image_stack.tif: tif stack of 600 preprocessed XLFM images.
    │   │   │   ├── XLFM_stack/
    │   │   │   │   └── XLFM_stack_nnn.tif: 3D stack corresponding to frame nnn.
    │   │   │   └── Neural_activity_coordinates.csv: 3D coordinates of neurons found with the suite2p framework.
    │   │   └── Raw/
    │   │       └── XLFM_image/
    │   │           └── XLFM_image_stack.tif: tif stack of 600 raw XLFM images.
    │   └── (other samples)
    ├── lenslet_centers_python.txt: 2D coordinates of the lenslets in the XLFM images.
    └── PSF_241depths_16bit.tif: 3D PSF of the microscope, spanning 734 × 734 × 550 µm³, which can be used for 3D deconvolution of the volumes.

    In this dataset, we provide a subset of the images and volumes. Due to space constraints, the 3D volumes are provided only for:

    • SLNet_preprocessed/XLFM_stack/: 10 interleaved frames between frames 0-499 (can be used for training a network) and 20 consecutive frames, 500-520 (can be used for testing).
    • Raw/: no volumes are provided for the raw data, but they can be reconstructed through 3D deconvolution.
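
    A minimal loading sketch with the tifffile package, assuming the directory layout above:

    import tifffile

    path = ("XLFM_dataset/Dataset/GCaMP6s_NLS_1/SLNet_preprocessed/"
            "XLFM_image/XLFM_image_stack.tif")
    stack = tifffile.imread(path)
    print(stack.shape)  # expected: (600, H, W)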

    Enjoy, and feel free to contact us with any information request, such as the full PSF, 3 more samples, or longer image sequences.
