12 datasets found
  1. COCO dataset and neural network weights for micro-FTIR particle detection on filters

    • data.niaid.nih.gov
    Updated Aug 13, 2024
    Cite
    Schowing, Thibault (2024). COCO dataset and neural network weights for micro-FTIR particle detection on filters. [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_10839526
    Dataset updated
    Aug 13, 2024
    Dataset authored and provided by
    Schowing, Thibault
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The IMPTOX project has received funding from the EU's H2020 framework programme for research and innovation under grant agreement no. 965173. IMPTOX is part of the European MNP cluster on human health.

    More information about the project here.

    Description: This repository includes the trained weights and a custom COCO-formatted dataset used for developing and testing a Faster R-CNN R_50_FPN_3x object detector, specifically designed to identify particles in micro-FTIR filter images.

    Contents:

    Weights File (neuralNetWeights_V3.pth):

    Format: .pth

    Description: This file contains the trained weights for a Faster R-CNN model with a ResNet-50 backbone and a Feature Pyramid Network (FPN), trained with a 3x schedule. These weights are specifically tuned for detecting particles in micro-FTIR filter images.

    Custom COCO Dataset (uFTIR_curated_square.v5-uftir_curated_square_2024-03-14.coco-segmentation.zip):

    Format: .zip

    Description: This zip archive contains a custom COCO-formatted dataset, including JPEG images and their corresponding annotation file. The dataset consists of images of micro-FTIR filters with annotated particles.

    Contents:

    Images: JPEG format images of micro-FTIR filters.

    Annotations: A JSON file in COCO format providing detailed annotations of the particles in the images.

    Management: The dataset can be managed and manipulated using the Pycocotools library, facilitating easy integration with existing COCO tools and workflows.

    Applications: The provided weights and dataset are intended for researchers and practitioners in the field of microscopy and particle detection. The dataset and model can be used for further training, validation, and fine-tuning of object detection models in similar domains.

    Usage Notes:

    The neuralNetWeights_V3.pth file should be loaded into a PyTorch implementation of the Faster R-CNN architecture, for example via Detectron2.
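    For orientation, here is a minimal sketch of how the weights might be loaded with Detectron2; the config name matches the architecture described above, but the class count, score threshold, and image path are assumptions and should be checked against the related repository.

      # Minimal sketch (assumptions noted above): load the provided weights into a
      # Detectron2 Faster R-CNN R_50_FPN_3x model and run inference on one image.
      import cv2
      from detectron2 import model_zoo
      from detectron2.config import get_cfg
      from detectron2.engine import DefaultPredictor

      cfg = get_cfg()
      cfg.merge_from_file(model_zoo.get_config_file("COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml"))
      cfg.MODEL.WEIGHTS = "neuralNetWeights_V3.pth"      # weights file from this record
      cfg.MODEL.ROI_HEADS.NUM_CLASSES = 1                # assumption: a single "particle" class
      cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.5        # assumption
      cfg.MODEL.DEVICE = "cpu"                           # or "cuda"

      predictor = DefaultPredictor(cfg)
      image = cv2.imread("example_uftir_filter.jpg")     # hypothetical image path
      outputs = predictor(image)
      print(outputs["instances"].pred_boxes, outputs["instances"].scores)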

    The contents of uFTIR_curated_square.v5-uftir_curated_square_2024-03-14.coco-segmentation.zip should be extracted and can be used with any COCO-compatible object detection framework for training and evaluation purposes.

    Code can be found in the related GitHub repository.

  2. MatterPort Mask R-CNN for Here We Grow!

    • kaggle.com
    Updated Mar 15, 2020
    Cite
    Kyle Stargarden (2020). MatterPort Mask R-CNN for Here We Grow! [Dataset]. https://www.kaggle.com/stargarden/matterhorn-mask-rcnn-for-here-we-grow/tasks
    Available download formats: Croissant (a format for machine-learning datasets; see mlcommons.org/croissant)
    Dataset updated
    Mar 15, 2020
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Kyle Stargarden
    Description

    Context

    Here We Grow is a community learning project for the SingularityNet blockchain.

    Our group set out to synthesize labeled data for computer vision tasks. Adam Kelly, from Immersive Limit, found a great way of doing this: Microsoft's Common Objects in Context (COCO) annotation format, Matterport's Mask R-CNN for image segmentation, and the GIMP photo editor. This dataset is for a Kaggle implementation of the Udemy course materials from "Complete Guide to Creating COCO Datasets" by Adam Kelly.

    Course can be found here: https://www.immersivelimit.com/courses

    Content

    The Matterport Mask R-CNN GitHub repository: https://github.com/matterport/Mask_RCNN

    CocoSynth data by Adam Kelly: https://www.kaggle.com/stargarden/cocosynth-for-here-we-grow

    These notebooks are from the course:

    Training and inference notebook: https://www.kaggle.com/stargarden/m-r-cnn-matterport-1

    Coco image viewer notebook: https://www.kaggle.com/stargarden/coco-image-viewer

    Acknowledgements

    Almost all of this code is borrowed from the Udemy course "Complete Guide to Creating COCO Datasets" by Immersive Limit and from Matterport's Mask R-CNN.

    https://github.com/matterport/Mask_RCNN/blob/master/LICENSE

    Inspiration

    Hopefully this can hypercharge some deep learning projects and contests. The ability to bootstrap extra images from the original dataset is pretty powerful.

  3. Segmentation comparisons on COCO stuff dataset

    • figshare.com
    xls
    Updated Feb 14, 2024
    Cite
    Qiuyuan Lei; Fei Lu (2024). Segmentation comparisons on COCO stuff dataset. [Dataset]. http://doi.org/10.1371/journal.pone.0295263.t005
    Available download formats: xls
    Dataset updated
    Feb 14, 2024
    Dataset provided by
    PLOS ONE
    Authors
    Qiuyuan Lei; Fei Lu
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Most semantic segmentation works have obtained accurate segmentation results through exploring the contextual dependencies. However, there are several major limitations that need further investigation. For example, most approaches rarely distinguish different types of contextual dependencies, which may pollute the scene understanding. Moreover, local convolutions are commonly used in deep learning models to learn attention and capture local patterns in the data. These convolutions operate on a small neighborhood of the input, focusing on nearby information and disregarding global structural patterns. To address these concerns, we propose a Global Domain Adaptation Attention with Data-Dependent Regulator (GDAAR) method to explore the contextual dependencies. Specifically, to effectively capture both the global distribution information and local appearance details, we suggest using a stacked relation approach. This involves incorporating the feature node itself and its pairwise affinities with all other feature nodes within the network, arranged in raster scan order. By doing so, we can learn a global domain adaptation attention mechanism. Meanwhile, to improve the similarity of features belonging to the same segment region while keeping the discriminative power of features belonging to different segments, we design a data-dependent regulator to adjust the global domain adaptation attention on the feature map during inference. Extensive ablation studies demonstrate that our GDAAR better captures the global distribution information for the contextual dependencies and achieves state-of-the-art performance on several popular benchmarks.

  4. Chicken Object Detection/Segmentation

    • kaggle.com
    Updated Nov 11, 2022
    Cite
    hayri (2022). Chicken Object Detection/Segmentation [Dataset]. https://www.kaggle.com/datasets/hayriyigit/chicken-detection
    Available download formats: Croissant
    Dataset updated
    Nov 11, 2022
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    hayri
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Motivation: I have to count all the chickens every day to check if they are all in the hencoop. As it's hard to count them in the nest, I decided to count them using CV. I will be placing the camera above the door, thus most of the photos are taken from an overhead angle.

    You can find the annotation file, named _annotations.coco.json, in the train folder.

    I did not split the dataset here and the annotation format is COCO, but you can find the other versions and formats here: https://universe.roboflow.com/training-kuvo9/chicken-counter

    Dataset:

    Size: 375 images

    Image shape: 640x640x3

    Annotation format: COCO (JSON)
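    As a quick orientation, here is a minimal sketch of inspecting the annotation file with the pycocotools library; the path follows the note above, and the choice of image is arbitrary.

      # Minimal sketch: inspect the COCO annotations shipped in the train folder.
      from pycocotools.coco import COCO

      coco = COCO("train/_annotations.coco.json")   # annotation file described above
      print("categories:", [c["name"] for c in coco.loadCats(coco.getCatIds())])
      print("number of images:", len(coco.getImgIds()))

      img_id = coco.getImgIds()[0]                  # arbitrary first image
      anns = coco.loadAnns(coco.getAnnIds(imgIds=img_id))
      print("annotations on first image:", len(anns))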

  5. SiDroForest: Synthetic Siberian Larch Tree Crown Dataset of 10.000 instances in the Microsoft's Common Objects in Context dataset (COCO) format

    • service.tib.eu
    Updated Nov 30, 2024
    Cite
    van Geffen, Femke; Brieger, Frederic; Pestryakova, Luidmila A; Zakharov, Evgenii S; Herzschuh, Ulrike; Kruse, Stefan (2021). SiDroForest: Synthetic Siberian Larch Tree Crown Dataset of 10.000 instances in the Microsoft's Common Objects in Context dataset (coco) format [Dataset]. https://doi.org/10.1594/PANGAEA.932795. Available at: https://service.tib.eu/ldmservice/dataset/png-doi-10-1594-pangaea-932795
    Dataset updated
    Nov 30, 2024
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This synthetic Siberian larch tree crown dataset was created for upscaling and machine learning purposes as a part of the SiDroForest (Siberia Drone Forest Inventory) project. The SiDroForest data collection (https://www.pangaea.de/?q=keyword%3A%22SiDroForest%22) consists of vegetation plots covered in Siberia during a two-month fieldwork expedition in 2018 by the Alfred Wegener Institute Helmholtz Centre for Polar and Marine Research in Germany. During fieldwork, fifty-six 50x50-meter vegetation plots were covered by Unmanned Aerial Vehicle (UAV) flights, and Red Green Blue (RGB) and Red Green Near Infrared (RGNIR) photographs were taken with a consumer-grade DJI Phantom 4 quadcopter.

    The synthetic dataset provided here contains larch (Larix gmelinii (Rupr.) Rupr. and Larix cajanderi Mayr.) tree crowns extracted from the onboard-camera RGB UAV images of five selected vegetation plots from this expedition, placed on top of full-resized images from the same RGB flights. The extracted tree crowns have been rotated, rescaled and repositioned across the images, resulting in a diverse synthetic dataset that contains 10,000 images for training and 2,000 images for validation of complex machine learning neural networks. In addition, the data is saved in the Microsoft Common Objects in Context (COCO) format (Lin et al., 2014) and can easily be loaded as a dataset for networks such as Mask R-CNN, U-Nets or Faster R-CNN. These are neural networks for instance segmentation tasks that have become more frequently used over the years for forest monitoring purposes.

    The images included in this dataset are from the field plots EN18062 (62.17° N, 127.81° E), EN18068 (63.07° N, 117.98° E), EN18074 (62.22° N, 117.02° E), EN18078 (61.57° N, 114.29° E) and EN18083 (59.97° N, 113° E), located in Central Yakutia, Siberia. These sites were selected based on their vegetation content, their spectral differences in color, the UAV flight angles, and the clarity of the UAV images, which were taken with automatic shutter and white balancing (Brieger et al. 2019).

    From each site, 35 images were selected in order of acquisition, starting at the fifteenth image in the flight, to make up the backgrounds for the dataset. The first fifteen images were excluded because they often contain a visual representation of the research team. The 117 tree crowns were manually cut out in the GIMP software to ensure that they were all Larix trees. Of the tree crowns, 15% are located at the margin of the image, to make sure that the algorithm does not rely on a full tree crown in order to detect a tree.

    As background images for the extracted tree crowns, the 35 raw UAV images selected for each of the five sites were included. The images were selected based on their content: in some of the UAV images the research team is visible, and those have been excluded from this dataset. The five sites were selected based on their spectral diversity and their vegetation content. The raw UAV images were cropped to 640 by 480 pixels at a resolution of 72 dpi and are later rescaled to 448 by 448 pixels in the process of the dataset creation. In total there were 175 cropped backgrounds.

    The synthetic images and their corresponding annotations and masks were created using the cocosynth Python software provided by Adam Kelly (2019). The software is open source and available on GitHub: https://github.com/akTwelve/cocosynth. The software takes the tree crowns, rescales and transforms them, and places up to three tree crowns on each of the provided backgrounds. The software also creates matching masks that are used by instance segmentation and object detection algorithms to learn the shapes and locations of the synthetic crowns. COCO annotation files with information about the crowns' names and labels are also generated. This format can be loaded into a variety of neural networks for training purposes.
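    For example, a COCO-formatted dataset such as this one can be read with torchvision's generic CocoDetection loader; the directory and file names below are placeholders, not the actual archive layout.

      # Minimal sketch: load a COCO-formatted tree-crown dataset with torchvision.
      # "train_images/" and "train_annotations.json" are placeholder paths.
      from torchvision.datasets import CocoDetection
      from torchvision import transforms

      dataset = CocoDetection(
          root="train_images/",
          annFile="train_annotations.json",
          transform=transforms.ToTensor(),
      )
      image, targets = dataset[0]   # targets is a list of COCO annotation dicts
      print(image.shape, len(targets))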

  6. River Water Segmentation Dataset (RIWA)

    • kaggle.com
    Updated Jan 26, 2023
    Cite
    Franz Wagner (2023). River Water Segmentation Dataset (RIWA) [Dataset]. http://doi.org/10.34740/kaggle/dsv/4901781
    Available download formats: Croissant
    Dataset updated
    Jan 26, 2023
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Franz Wagner
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    River Water Segmentation Dataset (RIWA)

    New in Version 2: to the best of our knowledge (01/2023), this is the largest high-quality river water segmentation dataset, with a minimum image size of 400x400.

    The dataset, called RIWA, provides pixel-wise binary river water segmentations. It consists of manually labelled smartphone, drone and DSLR images of rivers, as well as suitable images from the Water Segmentation Dataset and high-quality ADE20K images. The COCO images were withdrawn since their segmentation quality is extremely poor.

    Version history:

    Version 2 (declared as Version 4 by Kaggle): contains 1142 training, 167 validation and 323 test images. Min size: 400 x 400 (h x w). High-quality segmentations; if you find an error, please message us.

    Version 1: contains 789 training, 228 validation and 111 test images. Min size: 174 x 200 (h x w). Some segmentations are not perfect.

    Citation

    If you use this dataset, please cite as:

     @misc{RIWA_Dataset,
      title={River Water Segmentation Dataset (RIWA)},
      url={https://www.kaggle.com/dsv/4901781},
      DOI={10.34740/KAGGLE/DSV/4901781},
      publisher={Kaggle},
      author={Xabier Blanch and Franz Wagner and Anette Eltner},
      year={2023}
    }
    

    Contact: Xabier Blanch (TU Dresden), Franz Wagner (TU Dresden), Anette Eltner (TU Dresden).

    CNN comparison

    In 2023, we carried out a comparison to find the best CNN on this domain. If you are interested, please see our paper: River water segmentation in surveillance camera images: A comparative study of offline and online augmentation using 32 CNNs.

    We conducted the tests using the AiSeg GitLab repository. It can interactively train 2D and 3D CNNs, augment data offline and online, analyze single networks, compare multiple networks, and apply trained CNNs to new data. The RIWA dataset can be used directly.
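    For reference, pixel-wise water segmentation of this kind is usually scored with intersection over union (IoU); the following minimal NumPy sketch is a generic illustration and is not taken from the AiSeg repository.

      # Minimal sketch: IoU for binary (water / no-water) masks.
      import numpy as np

      def binary_iou(pred, target):
          """pred and target are {0,1} or boolean arrays of identical shape."""
          pred = pred.astype(bool)
          target = target.astype(bool)
          union = np.logical_or(pred, target).sum()
          if union == 0:
              return 1.0
          return np.logical_and(pred, target).sum() / union

      rng = np.random.default_rng(0)                # placeholder data only
      pred = rng.integers(0, 2, size=(400, 400))
      target = rng.integers(0, 2, size=(400, 400))
      print("IoU:", binary_iou(pred, target))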

    Background:

    The handling of natural disasters, especially heavy rainfall and corresponding floods, requires special demands on emergency services. The need to obtain a quick, efficient and real-time estimation of the water level is critical for monitoring a flood event. This is a challenging task and usually requires specially prepared river sections. In addition, in heavy flood events, some classical observation methods may be compromised.

    With the technological advances derived from image-based observation methods and segmentation algorithms based on neural networks (NN), it is possible to generate real-time, low-cost monitoring systems. This new approach makes it possible to densify the observation network, improving flood warning and management. In addition, images can be obtained by remotely positioned cameras, preventing data loss during a major event.

    The workflow we have developed for real-time monitoring consists of the integration of 3 different techniques. The first step consists of a topographic survey using Structure from Motion (SfM) strategies. In this stage, images of the area of interest are obtained using both terrestrial cameras and UAV images. The survey is completed by obtaining ground control point coordinates with multi-band GNSS equipment. The result is a 3D SfM model georeferenced to centimetre accuracy that allows us to reconstruct not only the river environment but also the riverbed.

    The second step consists of segmenting the images obtained with a surveillance camera installed ad hoc to monitor the river. This segmentation is achieved with the use of convolutional neural networks (CNN). The aim is to automatically segment the time-lapse images obtained every 15 minutes. We have carried out this research by testing different CNN to choose the most suitable structure for river segmentation, adapted to each study area and at each time of the day (day and night).

    The third step is based on the integration between the automatically segmented images and the 3D model acquired. The CNN-segmented river boundary is projected into the 3D SfM model to obtain a metric result of the water level based on the point of the 3D model closest to the image ray.

    The possibility of automating the segmentation and reprojection in the 3D model will allow the generation of a robust centimetre-accurate workflow, capable of estimating the water level in near real time both day and night. This strategy represents the basis for a better understanding of river flo...

  7. iSAID Dataset

    • kaggle.com
    Updated Jan 2, 2021
    Cite
    Tensor Girl (2021). iSAID Dataset [Dataset]. https://www.kaggle.com/usharengaraju/isaid-dataset/notebooks
    Available download formats: Croissant
    Dataset updated
    Jan 2, 2021
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Tensor Girl
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    Context

    Existing Earth Vision datasets are either suitable for semantic segmentation or object detection. iSAID is the first benchmark dataset for instance segmentation in aerial images. This large-scale and densely annotated dataset contains 655,451 object instances for 15 categories across 2,806 high-resolution images. The distinctive characteristics of iSAID are the following: (a) large number of images with high spatial resolution, (b) fifteen important and commonly occurring categories, (c) large number of instances per category, (d) large count of labelled instances per image, which might help in learning contextual information, (e) huge object scale variation, containing small, medium and large objects, often within the same image, (f) Imbalanced and uneven distribution of objects with varying orientation within images, depicting real-life aerial conditions, (g) several small size objects, with ambiguous appearance, can only be resolved with contextual reasoning, (h) precise instance-level annotations carried out by professional annotators, cross-checked and validated by expert annotators complying with well-defined guidelines.

    Content

    The images of iSAID are the same as those of the DOTA-v1.0 dataset. They are mainly collected from Google Earth; some are taken by the JL-1 satellite, and others by the GF-2 satellite of the China Centre for Resources Satellite Data and Application.

    Use of the images from Google Earth must respect the corresponding terms of use: "Google Earth" terms of use.

    All images and their associated annotations in iSAID can be used for academic purposes only, but any commercial use is prohibited.

    Object categories

    The object categories in iSAID include: plane, ship, storage tank, baseball diamond, tennis court, basketball court, ground track field, harbor, bridge, large vehicle, small vehicle, helicopter, roundabout, soccer ball field and swimming pool.

    Annotation format

    iSAID uses pixel-level annotations: each pixel represents a particular class. The annotations follow the format of MS COCO.
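    Because the annotations follow the MS COCO format, per-instance masks can be recovered with pycocotools; in the following minimal sketch the annotation file name is a placeholder.

      # Minimal sketch: turn COCO-style instance annotations into per-instance binary masks.
      from pycocotools.coco import COCO

      coco = COCO("iSAID_train.json")               # placeholder annotation file name
      img_id = coco.getImgIds()[0]
      anns = coco.loadAnns(coco.getAnnIds(imgIds=img_id))

      masks = [coco.annToMask(ann) for ann in anns]                           # one HxW mask per instance
      labels = [coco.loadCats(ann["category_id"])[0]["name"] for ann in anns]
      print(len(masks), "instances:", sorted(set(labels)))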

    Acknowledgements

     @inproceedings{waqas2019isaid,
      title={iSAID: A Large-scale Dataset for Instance Segmentation in Aerial Images},
      author={Waqas Zamir, Syed and Arora, Aditya and Gupta, Akshita and Khan, Salman and Sun, Guolei and Shahbaz Khan, Fahad and Zhu, Fan and Shao, Ling and Xia, Gui-Song and Bai, Xiang},
      booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops},
      pages={28--37},
      year={2019}
     }

     @InProceedings{Xia_2018_CVPR,
      author = {Xia, Gui-Song and Bai, Xiang and Ding, Jian and Zhu, Zhen and Belongie, Serge and Luo, Jiebo and Datcu, Mihai and Pelillo, Marcello and Zhang, Liangpei},
      title = {DOTA: A Large-Scale Dataset for Object Detection in Aerial Images},
      booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
      month = {June},
      year = {2018}
     }

  8. SubPipe: A Submarine Pipeline Inspection Dataset for Segmentation and Visual-inertial Localization

    • data.niaid.nih.gov
    • zenodo.org
    • +1more
    Updated Jul 5, 2024
    Cite
    Costa, Maria (2024). SubPipe: A Submarine Pipeline Inspection Dataset for Segmentation and Visual-inertial Localization [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_10053564
    Dataset updated
    Jul 5, 2024
    Dataset provided by
    Ribeiro Marnet, Luiza
    Antal, László
    Aubard, Martin
    Costa, Maria
    Álvarez-Tuñón, Olaya
    Brodskiy, Yury
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Abstract

    This paper presents SubPipe, an underwater dataset for SLAM, object detection, and image segmentation. SubPipe has been recorded using a lightweight autonomous underwater vehicle (LAUV), operated by OceanScan MST, and carrying a sensor suite including two cameras, a side-scan sonar, and an inertial navigation system, among other sensors. The AUV has been deployed in a pipeline inspection environment with a submarine pipe partially covered by sand. The AUV's pose ground truth is estimated from the navigation sensors. The side-scan sonar and RGB images include object detection and segmentation annotations, respectively. State-of-the-art segmentation, object detection, and SLAM methods are benchmarked on SubPipe to demonstrate the dataset's challenges and opportunities for leveraging computer vision algorithms. To the authors' knowledge, this is the first annotated underwater dataset providing a real pipeline inspection scenario. The dataset and experiments are publicly available online.

    On Zenodo we provide three versions of SubPipe: the full version (SubPipe.zip, ~80 GB unzipped) and two subsamples (SubPipeMini.zip, ~12 GB unzipped, and SubPipeMini2.zip, ~16 GB unzipped). Both subsamples are only parts of the entire dataset (SubPipe.zip). SubPipeMini is a subset containing the semantic segmentation data and interesting camera footage of the underwater pipeline. SubPipeMini2, on the other hand, focuses mainly on underwater side-scan sonar images of the seabed, including ground-truth object detection bounding boxes for the pipeline.

    For (re-)using/publishing SubPipe, please include the following copyright text:

    SubPipe is a public dataset of a submarine outfall pipeline, property of Oceanscan-MST. This dataset was acquired with a Light Autonomous Underwater Vehicle by Oceanscan-MST, within the scope of Challenge Camp 1 of the H2020 REMARO project.

    More information about OceanScan-MST can be found at this link.

    Cam0 — GoPro Hero 10

    Camera parameters:

    Resolution: 1520×2704

    fx = 1612.36

    fy = 1622.56

    cx = 1365.43

    cy = 741.27

    k1,k2, p1, p2 = [−0.247, 0.0869, −0.006, 0.001]
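    These values describe a standard pinhole camera model with radial and tangential distortion; as a minimal sketch (the frame path is a placeholder), the intrinsic matrix can be assembled and a Cam0 frame undistorted with OpenCV as follows.

      # Minimal sketch: Cam0 intrinsics from the parameters above, used to undistort a frame.
      import numpy as np
      import cv2

      K = np.array([[1612.36,     0.0, 1365.43],
                    [    0.0, 1622.56,  741.27],
                    [    0.0,     0.0,     1.0]])
      dist = np.array([-0.247, 0.0869, -0.006, 0.001])   # k1, k2, p1, p2

      frame = cv2.imread("cam0_frame.jpg")               # placeholder path
      undistorted = cv2.undistort(frame, K, dist)
      cv2.imwrite("cam0_frame_undistorted.jpg", undistorted)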

    Side-scan Sonars

    Each sonar image was created after every 20 pings (i.e., after every 20 new scan lines), which corresponds to approximately one image per second.

    Regarding the object detection annotations, we provide both COCO and YOLO formats for each annotation. A single COCO annotation file is provided per chunk and per frequency (low frequency vs. high frequency), whereas YOLO annotations are provided per SSS image file.
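    Converting between the two annotation styles is mechanical; the sketch below shows the usual COCO-to-YOLO box conversion, assuming a 2500-pixel-wide, 500-pixel-tall low-frequency image (the example box is made up).

      # Minimal sketch: COCO box [x_min, y_min, width, height] in pixels
      # -> YOLO box [x_center, y_center, width, height] normalized to the image size.
      def coco_to_yolo(bbox, img_w, img_h):
          x_min, y_min, w, h = bbox
          return [(x_min + w / 2.0) / img_w,
                  (y_min + h / 2.0) / img_h,
                  w / img_w,
                  h / img_h]

      print(coco_to_yolo([1200.0, 150.0, 300.0, 80.0], img_w=2500, img_h=500))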

    Metadata about the side-scan sonar images contained in this dataset:

    Images for object detection:

    Low Frequency (LF): 5000 images, image size 2500 × 500

    High Frequency (HF): 5030 images, image size 5000 × 500

    Total number of images: 10030

    Annotations:

    Low Frequency: 3163 annotations

    High Frequency: 3172 annotations

    Total number of annotations: 6335

  9. Hyperspectral Imaging Dataset for Laser Thermal Ablation Monitoring in Vital Organs

    • zenodo.org
    • data.niaid.nih.gov
    bin, zip
    Updated Dec 14, 2024
    Cite
    Viacheslav Danilov; Martina De Landro; Paola Saccomandi (2024). Hyperspectral Imaging Dataset for Laser Thermal Ablation Monitoring in Vital Organs [Dataset]. http://doi.org/10.5281/zenodo.10444213
    Available download formats: zip, bin
    Dataset updated
    Dec 14, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Viacheslav Danilov; Martina De Landro; Paola Saccomandi
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Objectives: The objective of the research was to use hyperspectral imaging (HSI) to detect thermal damage induced in vital organs (such as the liver, pancreas, and stomach) during laser thermal therapy. The experimental study was conducted during thermal ablation procedures on live pigs.

    Ethical Approval: The experiments were performed at the Institute for Image Guided Surgery in Strasbourg, France. This experimental study was approved by the local Ethical Committee on Animal Experimentation (ICOMETH No. 38.2015.01.069) and by the French Ministry of Higher Education and Research (protocol №APAFiS-19543-2019030112087889, approved on March 14, 2019). All animals were treated in accordance with the ARRIVE guidelines, the French legislation on the use and care of animals, and the guidelines of the Council of the European Union (2010/63/EU).

    Description: During our experimental study, we used a TIVITA hyperspectral camera to acquire hypercubes of size 640x480x100 voxels, indicating 640x480 pixels for 100 bands, and regular RGB images at each acquisition step. These bands were acquired directly from the hyperspectral camera without additional pre-processing. The hypercube was acquired in approximately 6 seconds and synchronized with the absence of breathing motion using a protocol implemented for animal anesthesia. Polyurethane markers were placed around the target area to serve as references for superimposing the hyperspectral images, which were acquired using target areas selected according to the hyperspectral camera manufacturer's guidelines.

    As part of our investigation, we included hyperspectral cubes from 20 experiments conducted under identical conditions in our study. The hyperspectral cubes were collected in three distinct stages. In the first stage, the cubes were gathered before laparotomy at a temperature of 37°C. In the second stage, we obtained the cubes as the temperature gradually increased from 60°C to 110°C at 10°C intervals. Finally, in the last stage, the cubes were collected after turning off the laser during the post-ablation phase. Thus, we obtained a total of 233 hyperspectral cubes, each consisting of 100 wavelengths, resulting in a dataset of 23,300 two-dimensional images. The temperature changes were recorded, and the “Temperature profile during laser ablation” image illustrates the corresponding profile, highlighting the specific time intervals during which the hyperspectral camera and laser were activated and deactivated. To provide a visual representation of the collected data, we have included several examples of images captured from different organs in the “Examples of ablation areas” figure.

    The raw dataset, comprising 233 hyperspectral cubes of 100 wavelengths each, was transformed into 699 single-channel images using PCA and t-SNE decompositions. These images were then divided into training and test subsets and prepared in the COCO object detection format. This COCO dataset can be used for training and testing different neural networks.
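    A minimal sketch of the kind of PCA reduction described above, collapsing a 640x480x100 hypercube to a single-channel image, is shown below; the authors' exact preprocessing may differ, and the cube here is random placeholder data.

      # Minimal sketch: reduce a hyperspectral cube (H x W x bands) to one channel with PCA.
      import numpy as np
      from sklearn.decomposition import PCA

      cube = np.random.rand(480, 640, 100)          # placeholder for a 640x480x100 hypercube
      h, w, bands = cube.shape

      pixels = cube.reshape(-1, bands)              # one spectrum per pixel
      component = PCA(n_components=1).fit_transform(pixels)
      single_channel = component.reshape(h, w)      # 2D image usable for COCO-style detection
      print(single_channel.shape)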

    Access to the Study: Further information about this study, including curated source code, dataset details, and trained models, can be accessed through the following repositories:

  10. ResNet-18

    • kaggle.com
    Updated Dec 12, 2017
    Cite
    PyTorch (2017). ResNet-18 [Dataset]. https://www.kaggle.com/datasets/pytorch/resnet18/suggestions
    Available download formats: Croissant
    Dataset updated
    Dec 12, 2017
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    PyTorch
    Description

    ResNet-18

    Deep Residual Learning for Image Recognition

    Deeper neural networks are more difficult to train. We present a residual learning framework to ease the training of networks that are substantially deeper than those used previously. We explicitly reformulate the layers as learning residual functions with reference to the layer inputs, instead of learning unreferenced functions. We provide comprehensive empirical evidence showing that these residual networks are easier to optimize, and can gain accuracy from considerably increased depth. On the ImageNet dataset we evaluate residual nets with a depth of up to 152 layers---8x deeper than VGG nets but still having lower complexity.

    An ensemble of these residual nets achieves 3.57% error on the ImageNet test set. This result won the 1st place on the ILSVRC 2015 classification task. We also present analysis on CIFAR-10 with 100 and 1000 layers.

    The depth of representations is of central importance for many visual recognition tasks. Solely due to our extremely deep representations, we obtain a 28% relative improvement on the COCO object detection dataset. Deep residual nets are foundations of our submissions to ILSVRC & COCO 2015 competitions, where we also won the 1st places on the tasks of ImageNet detection, ImageNet localization, COCO detection, and COCO segmentation.

    Authors: Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun
    https://arxiv.org/abs/1512.03385

    Architecture visualization: http://ethereon.github.io/netscope/#/gist/db945b393d40bfa26006

    ResNet architecture illustration: https://imgur.com/nyYh5xH.jpg

    What is a Pre-trained Model?

    A pre-trained model has been previously trained on a dataset and contains the weights and biases that represent the features of whichever dataset it was trained on. Learned features are often transferable to different data. For example, a model trained on a large dataset of bird images will contain learned features, like edges or horizontal lines, that are often transferable to your own dataset.

    Why use a Pre-trained Model?

    Pre-trained models are beneficial to us for many reasons. By using a pre-trained model you are saving time. Someone else has already spent the time and compute resources to learn a lot of features and your model will likely benefit from it.
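    As a minimal sketch of the usual transfer-learning recipe with this model in torchvision (the class count and the decision to freeze the backbone are assumptions about a downstream task, not part of this upload; older torchvision versions use pretrained=True instead of the weights argument):

      # Minimal sketch: fine-tune a pretrained ResNet-18 by replacing its final layer.
      import torch
      import torch.nn as nn
      from torchvision import models

      model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)

      for param in model.parameters():              # freeze the pretrained backbone
          param.requires_grad = False

      num_classes = 10                              # assumption: 10-class downstream task
      model.fc = nn.Linear(model.fc.in_features, num_classes)

      optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
      criterion = nn.CrossEntropyLoss()

      x = torch.randn(4, 3, 224, 224)               # one illustrative step on random data
      y = torch.randint(0, num_classes, (4,))
      loss = criterion(model(x), y)
      loss.backward()
      optimizer.step()
      print(float(loss))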

  11. SuperAnimal-Quadruped-80K

    • zenodo.org
    application/gzip
    Updated Nov 1, 2024
    Cite
    Zenodo (2024). SuperAnimal-Quadruped-80K [Dataset]. http://doi.org/10.5281/zenodo.14016777
    Available download formats: application/gzip
    Dataset updated
    Nov 1, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Time period covered
    Jun 9, 2024
    Description

    Introduction

    This dataset supports Ye et al. 2024, Nature Communications. Please cite this dataset and paper if you use this resource. Please also see Ye et al. 2024 for the full DataSheet that accompanies this download, including the metadata for how to use this data if you want to compare model results on benchmark tasks. Below is just a summary. Also see the dataset licensing below.

    Training Data

    It consists of the following datasets, which were used together for training:

    • AwA-Pose Quadruped dataset, see full details at (1).
    • AnimalPose See full details at (2).
    • AcinoSet See full details at (3).
    • Horse-30 Horse-30 dataset, benchmark task is called Horse-10; See full details at (4).
    • StanfordDogs See full details at (5, 6).
    • AP-10K See full details at (7).
    • iRodent: We utilized the iNaturalist API functions for scraping observations with the taxon ID of Suborder Myomorpha (8). The functions allowed us to filter the large number of observations down to the ones with photos under the CC BY-NC creative license. The most common types of rodents among the collected observations are Muskrat (Ondatra zibethicus), Brown Rat (Rattus norvegicus), House Mouse (Mus musculus), Black Rat (Rattus rattus), Hispid Cotton Rat (Sigmodon hispidus), Meadow Vole (Microtus pennsylvanicus), Bank Vole (Clethrionomys glareolus), Deer Mouse (Peromyscus maniculatus), White-footed Mouse (Peromyscus leucopus), and Striped Field Mouse (Apodemus agrarius). We then generated segmentation masks over target animals in the data by processing the media through an algorithm we designed that uses a Mask Region-Based Convolutional Neural Network (Mask R-CNN) (9) model with a ResNet-50-FPN backbone (10), pretrained on the COCO datasets (11); a minimal sketch of this kind of inference follows this list. The processed 443 images were then manually labeled with both pose annotations and segmentation masks. iRodent data is banked at https://zenodo.org/record/8250392.
    • APT-36K See full details at (12).
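    A minimal sketch of the kind of COCO-pretrained Mask R-CNN inference described for the iRodent images above; this is a generic torchvision illustration, not the authors' exact pipeline, and the image path and score threshold are placeholders.

      # Minimal sketch: candidate segmentation masks from a COCO-pretrained
      # Mask R-CNN with a ResNet-50-FPN backbone.
      import torch
      from torchvision.io import read_image
      from torchvision.models.detection import maskrcnn_resnet50_fpn
      from torchvision.transforms.functional import convert_image_dtype

      model = maskrcnn_resnet50_fpn(weights="DEFAULT").eval()

      img = convert_image_dtype(read_image("rodent_photo.jpg"), torch.float)  # placeholder path
      with torch.no_grad():
          output = model([img])[0]

      keep = output["scores"] > 0.7                 # assumed confidence threshold
      masks = output["masks"][keep]                 # N x 1 x H x W soft masks
      print(masks.shape)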

    Here is an image with a keypoint guide: https://images.squarespace-cdn.com/content/v1/57f6d51c9f74566f55ecf271/1690988780004-AG00N6OU1R21MZ0AU9RE/modelcard-SAQ.png

    Ethical Considerations

    • No experimental data was collected for this model; all datasets used are cited above.

    Caveats and Recommendations

    • Please note that each dataset was labeled by separate labs and separate individuals; therefore, while we map names to a unified pose vocabulary, there will be annotator bias in keypoint placement (see Ye et al. 2024 for our Supplementary Note on annotator bias). You will also note the dataset is highly diverse across species, but collectively has more representation of domesticated animals like dogs, cats, horses, and cattle. If the performance of a model trained on this data is not as good as you need it to be, we recommend first trying video adaptation (see Ye et al. 2024), or fine-tuning the weights with your own labeling.

    License

    Modified MIT.

    Copyright 2023-present by Mackenzie Mathis, Shaokai Ye, and contributors.

    Permission is hereby granted to you (hereafter "LICENSEE") a fully-paid, non-exclusive,
    and non-transferable license for academic, non-commercial purposes only (hereafter “LICENSE”)
    to use the "DATASET" subject to the following conditions:

    The above copyright notice and this permission notice shall be included in all copies or substantial
    portions of the Software:

    This data or resulting software may not be used to harm any animal deliberately.

    LICENSEE acknowledges that the DATASET is a research tool.
    THE DATASET IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING
    BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE DATASET OR THE USE OR OTHER DEALINGS IN THE DATASET.

    If this license is not appropriate for your application, please contact Prof. Mackenzie W. Mathis
    (mackenzie@post.harvard.edu) for a commercial use license.

    Please cite Ye et al if you use this DATASET in your work.

    References

    1. Prianka Banik, Lin Li, and Xishuang Dong. A novel dataset for keypoint detection of quadruped animals from images. ArXiv, abs/2108.13958, 2021
    2. Jinkun Cao, Hongyang Tang, Haoshu Fang, Xiaoyong Shen, Cewu Lu, and Yu-Wing Tai. Cross-domain adaptation for animal pose estimation. 2019 IEEE/CVF International Conference on Computer Vision (ICCV), pages 9497–9506, 2019.
    3. Daniel Joska, Liam Clark, Naoya Muramatsu, Ricardo Jericevich, Fred Nicolls, Alexander Mathis, Mackenzie W. Mathis, and Amir Patel. Acinoset: A 3d pose estimation dataset and baseline models for cheetahs in the wild. 2021 IEEE International Conference on Robotics and Automation (ICRA), pages 13901–13908, 2021.
    4. Alexander Mathis, Thomas Biasi, Steffen Schneider, Mert Yuksekgonul, Byron Rogers, Matthias Bethge, and Mackenzie W Mathis. Pretraining boosts out-of-domain robustness for pose estimation. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pages 1859–1868, 2021.
    5. Aditya Khosla, Nityananda Jayadevaprakash, Bangpeng Yao, and Li Fei-Fei. Novel dataset for fine-grained image categorization. In First Workshop on Fine-Grained Visual Categorization, IEEE Conference on Computer Vision and Pattern Recognition, Colorado Springs, CO, June 2011.
    6. Benjamin Biggs, Thomas Roddick, Andrew Fitzgibbon, and Roberto Cipolla. Creatures great and smal: Recovering the shape and motion of animals from video. In Asian Conference on Computer Vision, pages 3–19. Springer, 2018.
    7. Hang Yu, Yufei Xu, Jing Zhang, Wei Zhao, Ziyu Guan, and Dacheng Tao. Ap-10k: A benchmark for animal pose estimation in the wild. In Thirty-fifth Conference on Neural Information Processing Systems Datasets and Benchmarks Track (Round 2), 2021.
    8. iNaturalist. GBIF Occurrence Download. https://doi.org/10.15468/dl.p7nbxt. iNaturalist, July 2020.
    9. Kaiming He, Georgia Gkioxari, Piotr Dollár, and Ross Girshick. Mask r-cnn. In Proceedings of the IEEE international conference on computer vision, pages 2961–2969, 2017.
    10. Tsung-Yi Lin, Piotr Dollár, Ross Girshick, Kaiming He, Bharath Hariharan, and Serge Belongie. Feature pyramid networks for object detection, 2016.
    11. Tsung-Yi Lin, Michael Maire, Serge J. Belongie, Lubomir D. Bourdev, Ross B. Girshick, James Hays, Pietro Perona, Deva Ramanan, Piotr Doll’ar, and C. Lawrence Zitnick. Microsoft COCO: common objects in context. CoRR, abs/1405.0312, 2014
    12. Yuxiang Yang, Junjie Yang, Yufei Xu, Jing Zhang, Long Lan, and Dacheng Tao. Apt-36k: A large-scale benchmark for animal pose estimation and tracking. Advances in Neural Information Processing Systems, 35:17301–17313, 2022

    Versioning Note:

    - V2 includes fixes to Stanford Dog data; it affected less than 1% of the data.

  12. AHOD: Adaptive Hybrid Object Detector for Context-Awareed Item

    • figshare.com
    json
    Updated May 14, 2025
    Cite
    Serge AMAN (2025). AHOD: Adaptive Hybrid Object Detector for Context-Awareed Item [Dataset]. http://doi.org/10.6084/m9.figshare.29064287.v2
    Available download formats: json
    Dataset updated
    May 14, 2025
    Dataset provided by
    figshare
    Authors
    Serge AMAN
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    We evaluated our AHOD model using two well-known datasets in the field of object detection:

    COCO (Common Objects in Context): one of the most widely used benchmarks for object detection. It contains over 200,000 images and more than 80 object categories, and includes objects in varied and sometimes cluttered contexts, allowing the robustness of detectors to be evaluated.

    Pascal VOC: another reference dataset, often used for classification, detection and segmentation tasks. It includes 20 object categories, with precise bounding box annotations. It is less complex than COCO, but useful for comparing performance on more conventional objects.

    Tools, techniques and innovations used

    The AHOD architecture is based on three main modules:

    Feature Pyramid Enhancement (FPE): a multi-scale feature processing module. It improves the representation of objects of various sizes in the same image, and is inspired by architectures such as FPN (Feature Pyramid Networks), but optimised for better performance.

    Dynamic Context Module (DCM): an intelligent contextual module, capable of dynamically adjusting the extracted features according to the context (e.g. adapting the features to urban or rural areas in a road image). It enhances the model's ability to understand the overall context of the scene.

    Fast and Accurate Detection Head (FADH): an optimised detection head that seeks a compromise between the speed of YOLO and the accuracy of Faster R-CNN. It probably uses lightweight convolution layers or optimisations such as MobileNet/depthwise convolutions.

    Probable technologies used

    Although the summary does not specify this, we can reasonably assume that the following tools are used:

    Deep learning frameworks: PyTorch or TensorFlow, which are standard in object detection research.

    GPUs for training and inference, particularly for measuring inference times (essential in real-time applications).

    Standard evaluation techniques: mAP (mean Average Precision), a measure of average precision, and FPS (Frames Per Second) or inference time for real-time performance.
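    Since the evaluation criteria named above include inference time and FPS, the following minimal, generic timing sketch may be useful; the detector here is an off-the-shelf torchvision model standing in for AHOD, about whose implementation nothing is assumed.

      # Minimal sketch: measure mean inference time and FPS for an arbitrary detector.
      import time
      import torch
      from torchvision.models.detection import fasterrcnn_resnet50_fpn

      model = fasterrcnn_resnet50_fpn(weights="DEFAULT").eval()   # stand-in detector, not AHOD
      images = [torch.rand(3, 480, 640) for _ in range(10)]       # placeholder inputs

      with torch.no_grad():
          start = time.perf_counter()
          for img in images:
              model([img])
          elapsed = time.perf_counter() - start

      print(f"mean inference time: {elapsed / len(images):.3f} s, FPS: {len(images) / elapsed:.1f}")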

