Segment Anything 1 Billion (SA-1B) is a dataset designed for training general-purpose object segmentation models from open world images. The dataset was introduced in the paper "Segment Anything".
The SA-1B dataset consists of 11M diverse, high-resolution, licensed, and privacy-protecting images and 1.1B mask annotations. Masks are given in the COCO run-length encoding (RLE) format, and do not have classes.
The license is custom. Please read the full terms and conditions at https://ai.facebook.com/datasets/segment-anything-downloads.
All the features are in the original dataset except image.content (the content of the image).
You can decode segmentation masks with:
import tensorflow_datasets as tfds

pycocotools = tfds.core.lazy_imports.pycocotools

ds = tfds.load('segment_anything', split='train')
for example in tfds.as_numpy(ds):
  segmentation = example['annotations']['segmentation']
  for counts, size in zip(segmentation['counts'], segmentation['size']):
    encoded_mask = {'size': size, 'counts': counts}
    mask = pycocotools.decode(encoded_mask)  # np.array(dtype=uint8) mask
    ...
To use this dataset:
import tensorflow_datasets as tfds

ds = tfds.load('segment_anything', split='train')
for ex in ds.take(4):
  print(ex)
See the guide for more information on tensorflow_datasets.
Segmentation models perform a pixel-wise classification by classifying the pixels into different classes. The classified pixels correspond to different objects or regions in the image. These models have a wide variety of use cases across multiple domains. When used with satellite and aerial imagery, these models can help to identify features such as building footprints, roads, water bodies, crop fields, etc.

Generally, every segmentation model needs to be trained from scratch using a dataset labeled with the objects of interest. This can be an arduous and time-consuming task. Meta's Segment Anything Model (SAM) is aimed at creating a foundational model that can be used to segment (as the name suggests) anything using zero-shot learning and generalize across domains without additional training. SAM is trained on the Segment Anything 1-Billion mask dataset (SA-1B), which comprises a diverse set of 11 million images and over 1 billion masks. This makes the model highly robust in identifying object boundaries and differentiating between various objects across domains, even though it might never have seen them before. Use this model to extract masks of various objects in any image.

Using the model
Follow the guide to use the model. Before using this model, ensure that the supported deep learning libraries are installed. For more details, check Deep Learning Libraries Installer for ArcGIS.

Fine-tuning the model
This model can be fine-tuned using the SamLoRA architecture in ArcGIS. Follow the guide and refer to this sample notebook to fine-tune this model.

Input
8-bit, 3-band imagery.

Output
Feature class containing masks of various objects in the image.

Applicable geographies
The model is expected to work globally.

Model architecture
This model is based on the open-source Segment Anything Model (SAM) by Meta.

Training data
This model has been trained on the Segment Anything 1-Billion mask dataset (SA-1B), which comprises a diverse set of 11 million images and over 1 billion masks.

Sample results
Here are a few results from the model.
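Outside the ArcGIS workflow, a minimal sketch of the same zero-shot mask extraction can be written with Meta's open-source segment-anything package; the image and checkpoint paths below are placeholders, not files shipped with this model.

import cv2
from segment_anything import sam_model_registry, SamAutomaticMaskGenerator

# Load a SAM backbone from a locally downloaded checkpoint (path is an assumption).
sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth")
mask_generator = SamAutomaticMaskGenerator(sam)

# Read an RGB image and generate masks for everything SAM finds in it.
image = cv2.cvtColor(cv2.imread("scene.jpg"), cv2.COLOR_BGR2RGB)
masks = mask_generator.generate(image)  # list of dicts: 'segmentation', 'area', 'bbox', ...
print(f"{len(masks)} masks extracted")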
The Segment Anything Model (SAM) has proven to be a powerful foundation model for image segmentation, an important task in computer vision. However, the transfer of its rich semantic information to multiple different downstream tasks remains unexplored. In this paper, we propose the Task-Aware Low-Rank Adaptation (TA-LoRA) method, which enables SAM to work as a foundation model for multi-task learning.
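The paper's task-aware design is not spelled out in this summary; as a rough illustration of what low-rank adaptation of a frozen backbone layer looks like, here is a generic PyTorch LoRA wrapper (an assumption for illustration, not the TA-LoRA implementation).

import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Adds a trainable low-rank update B @ A to a frozen pretrained linear layer."""
    def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # the pretrained weights stay frozen
        self.lora_a = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.lora_b = nn.Parameter(torch.zeros(base.out_features, r))
        self.scale = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + self.scale * (x @ self.lora_a.T @ self.lora_b.T)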
Fig. 1: Diagram of the proposed blueberry fruit phenotyping workflow, involving four stages: data collection, dataset generation, model training, and phenotyping trait extraction. Our mobile platform, equipped with a multi-view imaging system (top, left, and right), was used to scan the blueberry plants by navigating over crop rows. On the basis of the fruit/cluster detection dataset, we leverage a maturity classifier and a segmentation foundation model, SAM, to generate a semantic instance dataset for immature, semi-mature, and mature fruit segmentation. We propose a lightweight improved YOLOv8 model for fruit cluster detection and blueberry segmentation for plant-scale and cluster-scale phenotyping trait extraction, including yield, maturity, cluster number, and compactness.
Dataset generation:
Fig. 2: Illustration of the proposed automated pixel-wise label generation for immature, semi-mature, and mature blueberry fruits (genotype: keecrisp). From left to right: (a) bounding-box labels of blueberries from our previous manual detection dataset [27]; (b) three-class box labels (immature: yellow, semi-mature: red, mature: blue) re-classified with a maturity classifier; (c) pixel-wise mask labels of blueberry fruits generated with the Segment Anything Model.
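A hedged sketch of the Fig. 2(c) step, prompting Meta's open-source SAM with an existing box label to obtain a pixel-wise fruit mask; the checkpoint, image path, and box coordinates below are placeholders.

import cv2
import numpy as np
from segment_anything import sam_model_registry, SamPredictor

sam = sam_model_registry["vit_b"](checkpoint="sam_vit_b_01ec64.pth")  # assumed local checkpoint
predictor = SamPredictor(sam)

image = cv2.cvtColor(cv2.imread("blueberry_plot.jpg"), cv2.COLOR_BGR2RGB)
predictor.set_image(image)

box = np.array([120, 80, 180, 140])  # placeholder xyxy box from the detection dataset
masks, scores, _ = predictor.predict(box=box, multimask_output=False)
fruit_mask = masks[0]  # boolean pixel-wise mask for the boxed berry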
If you find this work or code useful, please cite:
@article{li2025robotic_blueberry_phenotyping,
  title={In-field blueberry fruit phenotyping with a MARS-PhenoBot and customized BerryNet},
  author={Li, Zhengkun and Xu, Rui and Li, Changying and Munoz, Patricio and Takeda, Fumiomi and Leme, Bruno},
  journal={Computers and Electronics in Agriculture},
  volume={232},
  pages={110057},
  year={2025},
  publisher={Elsevier}
}
https://researchdata.ntu.edu.sg/api/datasets/:persistentId/versions/1.0/customlicense?persistentId=doi:10.21979/N9/L05ULT
The CLIP and Segment Anything Model (SAM) are remarkable vision foundation models (VFMs). SAM excels in segmentation tasks across diverse domains, whereas CLIP is renowned for its zero-shot recognition capabilities. This paper presents an in-depth exploration of integrating these two models into a unified framework. Specifically, we introduce the Open-Vocabulary SAM, a SAM-inspired model designed for simultaneous interactive segmentation and recognition, leveraging two unique knowledge transfer modules: SAM2CLIP and CLIP2SAM. The former adapts SAM’s knowledge into the CLIP via distillation and learnable transformer adapters, while the latter transfers CLIP knowledge into SAM, enhancing its recognition capabilities. Extensive experiments on various datasets and detectors show the effectiveness of Open-Vocabulary SAM in both segmentation and recognition tasks, significantly outperforming the naïve baselines of simply combining SAM and CLIP. Furthermore, aided with image classification data training, our method can segment and recognize approximately 22,000 classes.
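As a point of reference for the naive combination mentioned above, a minimal sketch is to classify each SAM mask crop with zero-shot CLIP; the Hugging Face model name and label prompts below are assumptions, and this is not the Open-Vocabulary SAM architecture itself.

import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

clip = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")
labels = ["a photo of a dog", "a photo of a cat", "a photo of a car"]  # placeholder vocabulary

def classify_crop(crop: Image.Image) -> str:
    """Zero-shot classify one crop (e.g. the bounding box of a SAM mask)."""
    inputs = processor(text=labels, images=crop, return_tensors="pt", padding=True)
    with torch.no_grad():
        logits = clip(**inputs).logits_per_image  # shape (1, len(labels))
    return labels[int(logits.argmax())]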
This deep learning model is used to detect and segment trees in high-resolution drone or aerial imagery. Tree detection can be used for applications such as vegetation management, forestry, urban planning, etc. High-resolution aerial and drone imagery can be used for tree detection due to its high spatio-temporal coverage. This deep learning model is based on DeepForest and has been trained on data from the National Ecological Observatory Network (NEON). The model also uses the Segment Anything Model (SAM) by Meta.

Using the model
Follow the guide to use the model. Before using this model, ensure that the supported deep learning libraries are installed. For more details, check Deep Learning Libraries Installer for ArcGIS.

Fine-tuning the model
This model cannot be fine-tuned using ArcGIS tools.

Input
8-bit, 3-band high-resolution (10-25 cm) imagery.

Output
Feature class containing separate masks for each tree.

Applicable geographies
The model is expected to work well in the United States.

Model architecture
This model is based upon the DeepForest python package, which uses the RetinaNet model architecture implemented in torchvision, and the open-source Segment Anything Model (SAM) by Meta.

Accuracy metrics
This model has a precision score of 0.66 and a recall of 0.79.

Training data
This model has been trained on the NEON Tree Benchmark dataset, provided by the Weecology Lab at the University of Florida. The model also uses the Segment Anything Model (SAM) by Meta, which is trained on the 1-Billion mask dataset (SA-1B) comprising a diverse set of 11 million images and over 1 billion masks.

Sample results
Here are a few results from the model.

Citations
Weinstein, B.G.; Marconi, S.; Bohlman, S.; Zare, A.; White, E. Individual Tree-Crown Detection in RGB Imagery Using Semi-Supervised Deep Learning Neural Networks. Remote Sens. 2019, 11, 1309.
Weinstein, B.; Marconi, S.; Bohlman, S.; Zare, A.; White, E.P. Geographic Generalization in Airborne RGB Deep Learning Tree Detection. bioRxiv 790071; doi: https://doi.org/10.1101/790071.
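Outside ArcGIS, a hedged sketch of the first stage of this pipeline uses the open-source DeepForest package to propose tree-crown boxes; each box can then be passed to SAM as a box prompt, as in the blueberry sketch above. The tile path is a placeholder, and the calls follow the public DeepForest documentation, which may differ between package versions.

from deepforest import main as deepforest_main

tree_model = deepforest_main.deepforest()
tree_model.use_release()  # download the prebuilt NEON release weights
boxes = tree_model.predict_image(path="ortho_tile.png")  # DataFrame: xmin, ymin, xmax, ymax, score
print(boxes.head())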
Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Introduction
This dataset aims to explore the realm of object detection and segmentation with a specific focus on its applications in agriculture. The primary objective is to employ YOLOv8 and SAM techniques to develop robust models for detecting grape bunches.
Dataset Description
The dataset comprises four trained models utilizing YOLOv8 architecture. It includes two single-class models, one utilizing object detection and the other employing instance segmentation for grape detection. Additionally, there are two multi-class models capable of predicting and detecting different grape varietals. All models were trained using the large model from the Ultralytics repository (https://github.com/ultralytics/ultralytics).
The dataset encompasses four grape varietals:
- Pinot Noir: 102 images and labels
- Chardonnay: 39 images and labels from the author and 47 from thsant
- Sauvignon Blanc: 42 images and labels
- Pinot Gris: 111 images and labels
Total used for training: 341
Note that training of the segmentation models used 20 images from each, for a total of 100.
Datasets Used for Training
To see the dataset (e.g., the train/test/val folders) used for training the multi-class object detection model, please see the following zip file and notebook:
To build a custom training dataset, please follow the instructions in the notebook: https://www.kaggle.com/code/nicolaasregnier/buildtraindataset/
The labels used for training the multi-class instance segmentation model are under the folder SAMPreds
Data Sources
The dataset incorporates two primary data sources. The first source is a collection of images captured using an iPad Air 2 RGB camera. These images possess a resolution of 3226x2449 pixels and an 8-megapixel quality. The second source is contributed by GitHub user thsant, who has created an impressive project available at https://github.com/thsant/wgisd/tree/master.
To label the data, a base model from a previous dataset was utilized, and the annotation process was carried out using LabelImg (https://github.com/heartexlabs/labelImg). It is important to note that some annotations from thsant's dataset required modifications for completeness.
Implementation Steps
The data preparation involved the utilization of classes and functions from the "my_SAM" (https://github.com/regs08/my_SAM) and "KaggleUtils" (https://github.com/regs08/KaggleUtils) repositories, facilitating the creation of training sets and the application of SAM techniques.
For model training, the YOLOv8 architecture with default hyperparameters was employed. The object detection models underwent 50 epochs of training, while the instance segmentation models were trained for 75 epochs.
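A minimal sketch of that training setup with the Ultralytics API, keeping the default hyperparameters; the dataset YAML paths are placeholders.

from ultralytics import YOLO

det_model = YOLO("yolov8l.pt")       # large detection model
det_model.train(data="grapes.yaml", epochs=50)

seg_model = YOLO("yolov8l-seg.pt")   # large instance segmentation model
seg_model.train(data="grapes_seg.yaml", epochs=75)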
Segment Anything (SAM) from https://segment-anything.com/ was applied to the bbox-labeled data to generate images and corresponding masks for the instance segmentation models. No further editing of the images occurred after applying SAM.
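One possible way (an illustration, not necessarily what the my_SAM repository does) to turn a SAM binary mask into an Ultralytics-style segmentation label line is to trace its largest contour and normalize the coordinates:

import cv2
import numpy as np

def mask_to_yolo_seg(mask: np.ndarray, class_id: int) -> str:
    """Convert a binary mask to 'class x1 y1 x2 y2 ...' with coordinates normalized to [0, 1]."""
    h, w = mask.shape
    contours, _ = cv2.findContours(mask.astype(np.uint8), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    contour = max(contours, key=cv2.contourArea).reshape(-1, 2)
    coords = " ".join(f"{x / w:.6f} {y / h:.6f}" for x, y in contour)
    return f"{class_id} {coords}"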
Evaluation and Inference
The evaluation metric used was mean average precision (mAP). The following mAP values were obtained:
Single-class object detection:
- mAP50: 0.85449
- mAP50-95: 0.56177

Multi-class object detection:
- mAP50: 0.85336
- mAP50-95: 0.56316

Single-class instance segmentation:
- mAP50: (value not provided)
- mAP50-95: (value not provided)

Multi-class instance segmentation:
- mAP50: 0.89436
- mAP50-95: 0.62785
For more comprehensive metrics, please refer to the results folder corresponding to the model of interest.
Complex structures can be understood as compositions of smaller, more basic elements. The characterization of these structures requires an analysis of their constituents and their spatial configuration. Examples can be found in systems as diverse as galaxies, alloys, living tissues, cells, and even nanoparticles. In the latter field, the most challenging examples are those of subdivided particles and particle-based materials, due to the close proximity of their constituents. The characterization of such nanostructured materials is typically conducted through the utilization of micrographs. Despite the importance of micrograph analysis, the extraction of quantitative data is often constrained. The presented effort demonstrates the morphological characterization of subdivided particles utilizing a pre-trained artificial intelligence model. The results are validated using three types of nanoparticles: nanospheres, dumbbells, and trimers. The automated segmentation of whole particles, as well as their individual subdivisions, is investigated using the Segment Anything Model, which is based on a pre-trained neural network. The subdivisions of the particles are organized into sets, which presents a novel approach in this field. These sets collate data derived from a large ensemble of specific particle domains indicating to which particle each subdomain belongs. The arrangement of subdivisions into sets to characterize complex nanoparticles expands the information gathered from microscopy analysis. The presented method, which employs a pre-trained deep learning model, outperforms traditional techniques by circumventing systemic errors and human bias. It can effectively automate the analysis of particles, thereby providing more accurate and efficient results.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The scarcity of high-quality annotated data poses a significant challenge to the application of deep learning in fabric defect tasks, limiting the generalization and segmentation performance of existing models and impeding their capability to address the complexity of various fabric types and defects. To overcome these obstacles, this study introduces an innovative method to infuse specialized knowledge of fabric defects into the Segment Anything Model (SAM), a large-scale visual model. By introducing and training a unique set of fabric defect-related parameters, this approach seamlessly integrates domain-specific knowledge into SAM without the need for extensive modifications to the preexisting model parameters. The revamped SAM model leverages generalized image understanding learned from large-scale natural image datasets while incorporating fabric defect-specific knowledge, ensuring its proficiency in fabric defect segmentation tasks. The experimental results reveal a significant improvement in the model’s segmentation performance, attributable to this novel amalgamation of generic and fabric-specific knowledge. When benchmarking against popular existing segmentation models across three datasets, our proposed model demonstrates a substantial leap in performance. Its impressive results in cross-dataset comparisons and few-shot learning experiments further demonstrate its potential for practical applications in textile quality control.
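The paper's specific parameter set is not described here; a generic PyTorch sketch of the overall strategy (freeze the pretrained SAM weights, train only a small set of newly added parameters) might look like the following, where the adapter module is an illustrative assumption.

import torch
import torch.nn as nn
from segment_anything import sam_model_registry

sam = sam_model_registry["vit_b"](checkpoint="sam_vit_b_01ec64.pth")  # assumed local checkpoint
for p in sam.parameters():
    p.requires_grad = False  # preexisting model parameters stay untouched

adapter = nn.Sequential(nn.Linear(256, 64), nn.GELU(), nn.Linear(64, 256))  # hypothetical defect-specific parameters
optimizer = torch.optim.AdamW(adapter.parameters(), lr=1e-4)  # only the new parameters are trained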
https://spdx.org/licenses/MIT.html
This dataset contains the necessary code for using our soot (instance) segmentation model for segmenting soot filaments from PIV (Mie scattering) images. In the corresponding paper, an ablation study is conducted to delineate the effects of the domain randomisation parameters of synthetically generated training data on the segmentation accuracy. The best model is used to extract high-level statistics from soot filaments in an RQL-type model combustor to enhance the fundamental understanding of soot formation, transport, and oxidation.
B. Jose, K. P. Geigle, F. Hampp, Domain-Randomised Instance-Segmentation Benchmark for Soot in PIV Images, submitted to Machine Learning: Science and Technology (2025).
ATCS is a dataset designed to train deep learning models to volumetrically segment clouds from multi-angle satellite imagery. The dataset consists of spatiotemporally aligned patches of multi-angle polarimetry from the POLDER sensor aboard the PARASOL mission and vertical cloud profiles from the 2B-CLDCLASS product using the cloud profiling radar (CPR) aboard CloudSat.
Coast Train is a library of images of coastal environments, annotations, and corresponding thematic label masks (or ‘label images’) collated for the purposes of training and evaluating machine learning (ML), deep learning, and other models for image segmentation. It includes image sets from geospatial satellite, aerial, and UAV imagery and orthomosaics, as well as non-geospatial oblique and nadir imagery. Images include a diverse range of coastal environments from the U.S. Pacific, Gulf of Mexico, Atlantic, and Great Lakes coastlines, consisting of time-series of high-resolution (≤1m) orthomosaics and satellite image tiles (10–30m).

Each image, image annotation, and labelled image is available as a single NPZ zipped file. NPZ files follow the naming convention {datasource}{numberofclasses}{threedigitdatasetversion}.zip, where {datasource} is the source of the original images (for example, NAIP, Landsat 8, Sentinel 2), {numberofclasses} is the number of classes used to annotate the images, and {threedigitdatasetversion} is the three-digit code corresponding to the dataset version (in other words, 001 is version 1). Each zipped folder contains a collection of NPZ format files, each of which corresponds to an individual image. An individual NPZ file is named after the image that it represents and contains (1) a CSV file with detail information for every image in the zip folder and (2) a collection of the following NPY files: orig_image.npy (original input image, unedited), image.npy (original input image after color balancing and normalization), classes.npy (list of classes annotated and present in the labelled image), doodles.npy (integer image of all image annotations), color_doodles.npy (color image of doodles.npy), label.npy (labelled image created from the classes present in the annotations), and settings.npy (annotation and machine learning settings used to generate the labelled image from annotations).

All NPZ files can be extracted using the utilities available in Doodler (Buscombe, 2022). A merged CSV file containing detail information on the complete imagery collection is available at the top level of this data release, details of which are available in the Entity and Attribute section of this metadata file.
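Besides the Doodler utilities, a hedged sketch of reading one of these NPZ files directly with NumPy looks like this; the file name is a placeholder and the array keys are assumed to match the NPY names listed above.

import numpy as np

with np.load("example_image.npz", allow_pickle=True) as data:
    orig_image = data["orig_image"]  # original, unedited input image
    label = data["label"]            # labelled image built from the annotations
    classes = data["classes"]        # classes annotated and present in the labelled image
print(orig_image.shape, label.shape, classes)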
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Dataset Title: Bugzz lightyears: To Semantic Segmentation and Bug-yond!
This dataset comprises a collection of real and robotic toy bugs designed for a small-scale semantic segmentation project. Each bug has been captured six times from various angles, ensuring comprehensive coverage of their features and details. The dataset serves as a valuable resource for exploring semantic segmentation techniques and evaluating machine learning models.
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
TomatoMask_SAM
This is a dataset of "in-the-wild" leaf images with segmentation masks generated by the Segment Anything 2 (SAM 2) model.
Dataset Description
This dataset contains multi-leaf, "in-the-wild" images of plants. The segmentation masks were automatically generated using the SAM2AutomaticMaskGenerator and then processed to create a final binary mask for each image, highlighting the most prominent leaf structures. This dataset is intended for training and… See the full description on the dataset page: https://huggingface.co/datasets/LeafNet75/TomatoMask_SAM.
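A minimal sketch for pulling the dataset from the Hugging Face Hub with the datasets library; the split name and column layout are assumptions, so check the dataset page linked above for the actual schema.

from datasets import load_dataset

ds = load_dataset("LeafNet75/TomatoMask_SAM", split="train")
example = ds[0]
print(example.keys())  # expected to contain an image and its binary leaf mask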
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The data involved in this paper is from https://www.planet.com/explorer/. The resolution is 3 m, and there are three main bands (RGB). Because an educational account on the platform can only download a limited amount of data, and the data is retained for only one month, we chose 8 major cities for the study, with 2 images per city. We also provide detailed information on the data visualization and classification results of our tests in PPT files called paper and paper-result, which can be easily reviewed by reviewers. Reviewers can also download the data to verify the applicability of the results based on the coordinates of the data sources provided in this paper.

The algorithms consist of three main types. The first is based on traditional algorithms, both object-based and pixel-based, for which we tested the generalization ability of four classifiers: Random Forest, Support Vector Machine, Maximum Likelihood, and K-means. In addition, we tested two mainstream deep learning classification algorithms, U-Net and DeepLabV3, both of which can be found and applied in ArcGIS Pro. The workflow for the traditional algorithms is described at https://pro.arcgis.com/en/pro-app/latest/help/analysis/image-analyst/the-image-classification-wizard.htm, and the related parameter settings and sample selection rules are detailed in the article. The deep learning workflow is described at https://pro.arcgis.com/en/pro-app/latest/help/analysis/deep-learning/deep-learning-in-arcgis-pro.htm, and the related parameter settings and sample selection rules are detailed in the article. Finally, the large-model approach is based on SAM; the SAM workflow is from https://github.com/facebookresearch/segment-anything, and Meta's official web-based segmentation platform at https://segment-anything.com/ can also be used for testing. However, the official website has restrictions on the data format and the scope of processing.
## Overview
Segmentation Dataset is a dataset for instance segmentation tasks - it contains TrainingData ArVl annotations for 3,312 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
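A hedged sketch of the usual Roboflow download flow; the API key, workspace, project, and version identifiers below are placeholders rather than values taken from this dataset page.

from roboflow import Roboflow

rf = Roboflow(api_key="YOUR_API_KEY")
project = rf.workspace("your-workspace").project("segmentation-dataset")
dataset = project.version(1).download("coco")  # export format is an assumption
print(dataset.location)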
A dog segmentation dataset created manually typically involves the following steps:
Image selection: Selecting a set of images that include dogs in various poses and backgrounds.
Image labeling: Manually labeling the dogs in each image using a labeling tool, where each dog is segmented and assigned a unique label.
Image annotation: Annotating the labeled images with the corresponding segmentation masks, where the dog region is assigned a value of 1 and the background region is assigned a value of 0.
Dataset splitting: Splitting the annotated dataset into training, validation, and test sets.
Dataset format: Saving the annotated dataset in a format suitable for use in machine learning frameworks such as TensorFlow or PyTorch.
Dataset characteristics: The dataset may have varying image sizes and resolutions, different dog breeds, backgrounds, lighting conditions, and other variations that are typical of natural images.
Dataset size: The size of the dataset can vary, but it should be large enough to provide a sufficient amount of training data for deep learning models.
Dataset availability: The dataset may be made publicly available for research and educational purposes.
Overall, a manually created dog segmentation dataset provides high-quality training data for deep learning models and is essential for developing robust segmentation models; a minimal loading sketch follows below.
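A minimal PyTorch sketch (an illustration, not a file shipped with the dataset) of loading image/mask pairs in the format described above, with dog pixels labeled 1 and background 0; the directory layout and matching file names are assumptions.

import os
import numpy as np
import torch
from PIL import Image
from torch.utils.data import Dataset

class DogSegDataset(Dataset):
    def __init__(self, image_dir: str, mask_dir: str):
        self.image_dir, self.mask_dir = image_dir, mask_dir
        self.names = sorted(os.listdir(image_dir))  # assumes masks share the image file names

    def __len__(self):
        return len(self.names)

    def __getitem__(self, idx):
        name = self.names[idx]
        image = np.array(Image.open(os.path.join(self.image_dir, name)).convert("RGB"))
        mask = np.array(Image.open(os.path.join(self.mask_dir, name)))  # 1 = dog, 0 = background
        image = torch.from_numpy(image).permute(2, 0, 1).float() / 255.0
        return image, torch.from_numpy(mask).long()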
https://www.law.cornell.edu/uscode/text/17/106
Medical image analysis is critical to biological studies, health research, computer-aided diagnoses, and clinical applications. Recently, deep learning (DL) techniques have achieved remarkable successes in medical image analysis applications. However, these techniques typically require large amounts of annotations to achieve satisfactory performance. Therefore, in this dissertation, we seek to address this critical problem: How can we develop efficient and effective DL algorithms for medical image analysis while reducing annotation efforts? To address this problem, we have outlined two specific aims: (A1) Utilize existing annotations effectively from advanced models; (A2) Extract generic knowledge directly from unannotated images.
To achieve aim (A1): First, we introduce a new data representation called TopoImages, which encodes the local topology of all the image pixels. TopoImages can be complemented with the original images to improve medical image analysis tasks. Second, we propose a new augmentation method, SAMAug-C, that leverages the Segment Anything Model (SAM) to augment raw image input and enhance medical image classification. Third, we propose two advanced DL architectures, kCBAC-Net and ConvFormer, to enhance the performance of 2D and 3D medical image segmentation. We also present a gate-regularized network training (GrNT) approach to improve multi-scale fusion in medical image segmentation. To achieve aim (A2), we propose a novel extension of known Masked Autoencoders (MAEs) for self pre-training, i.e., models pre-trained on the same target dataset, specifically for 3D medical image segmentation.
Scientific visualization is a powerful approach for understanding and analyzing various physical or natural phenomena, such as climate change or chemical reactions. However, the cost of scientific simulations is high when factors like time, ensemble, and multivariate analyses are involved. Additionally, scientists can only afford to sparsely store the simulation outputs (e.g., scalar field data) or visual representations (e.g., streamlines) or visualization images due to limited I/O bandwidths and storage space. Therefore, in this dissertation, we seek to address this critical problem: How can we develop efficient and effective DL algorithms for scientific data generation and compression while reducing simulation and storage costs?
To tackle this problem: First, we propose a DL framework that generates unsteady vector field data from a set of streamlines. Based on this method, domain scientists only need to store representative streamlines at simulation time and reconstruct vector fields during post-processing. Second, we design a novel DL method that translates scalar fields to vector fields. Using this approach, domain scientists only need to store scalar field data at simulation time and generate vector fields from their scalar field counterparts afterward. Third, we present a new DL approach that compresses a large collection of visualization images generated from time-varying data for communicating volume visualization results.
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
The dataset contains aerospace images and their corresponding semantic segmentation masks. The dataset is taken from a GitHub repository. It has 3269 images and their corresponding segmentation masks. There are 11 classes in the dataset.
The dataset is not split into train, validation, and test folders; for training purposes, it should therefore be split into the three sets needed for machine learning and deep learning tasks, namely train, validation, and test (a sketch of one way to do this is given after the structure below). The structure of the data is as follows:
ROOT
├── images
│   ├── img_file
│   ├── img_file
│   ├── ...
│   └── img_file
└── labels
    ├── img_file
    ├── img_file
    ├── ...
    └── img_file
For the semantic segmentation task, images and their corresponding labels share the same file name. Good luck!
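A hedged sketch of one way to create the missing train/validation/test split mentioned above; the 80/10/10 ratio is an assumption, and the directory names follow the structure shown.

import os
import random
import shutil

random.seed(0)
names = sorted(os.listdir("ROOT/images"))
random.shuffle(names)
n = len(names)
splits = {"train": names[:int(0.8 * n)],
          "val": names[int(0.8 * n):int(0.9 * n)],
          "test": names[int(0.9 * n):]}

for split, files in splits.items():
    for sub in ("images", "labels"):
        os.makedirs(os.path.join("ROOT", split, sub), exist_ok=True)
        for name in files:  # labels share the same file names as images
            shutil.copy(os.path.join("ROOT", sub, name), os.path.join("ROOT", split, sub, name))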
https://spdx.org/licenses/MIT.html
This dataset contains the necessary code for using our spray segmentation model used in the paper, Machine learning based spray process quantification. More information can be found in the README.md.