Segment Anything 1 Billion (SA-1B) is a dataset designed for training general-purpose object segmentation models from open world images. The dataset was introduced in the paper "Segment Anything".
The SA-1B dataset consists of 11M diverse, high-resolution, licensed, and privacy-protecting images and 1.1B mask annotations. Masks are given in the COCO run-length encoding (RLE) format, and do not have classes.
The license is custom. Please read the full terms and conditions at https://ai.facebook.com/datasets/segment-anything-downloads.
All the features of the original dataset are included except image.content (the content of the image).
You can decode segmentation masks with:
import tensorflow_datasets as tfds

pycocotools = tfds.core.lazy_imports.pycocotools

ds = tfds.load('segment_anything', split='train')
for example in tfds.as_numpy(ds):
  segmentation = example['annotations']['segmentation']
  for counts, size in zip(segmentation['counts'], segmentation['size']):
    encoded_mask = {'size': size, 'counts': counts}
    mask = pycocotools.decode(encoded_mask)  # np.array(dtype=uint8) mask
    ...
To use this dataset:
import tensorflow_datasets as tfds

ds = tfds.load('segment_anything', split='train')
for ex in ds.take(4):
  print(ex)
See the guide for more information on tensorflow_datasets.
Segmentation models perform a pixel-wise classification by classifying the pixels into different classes. The classified pixels correspond to different objects or regions in the image. These models have a wide variety of use cases across multiple domains. When used with satellite and aerial imagery, these models can help to identify features such as building footprints, roads, water bodies, crop fields, etc.

Generally, every segmentation model needs to be trained from scratch using a dataset labeled with the objects of interest. This can be an arduous and time-consuming task. Meta's Segment Anything Model (SAM) is aimed at creating a foundational model that can be used to segment (as the name suggests) anything using zero-shot learning and generalize across domains without additional training. SAM is trained on the Segment Anything 1-Billion mask dataset (SA-1B), which comprises a diverse set of 11 million images and over 1 billion masks. This makes the model highly robust in identifying object boundaries and differentiating between various objects across domains, even though it might have never seen them before. Use this model to extract masks of various objects in any image.

Using the model: Follow the guide to use the model. Before using this model, ensure that the supported deep learning libraries are installed. For more details, check Deep Learning Libraries Installer for ArcGIS.

Fine-tuning the model: This model can be fine-tuned using the SamLoRA architecture in ArcGIS. Follow the guide and refer to this sample notebook to fine-tune this model.

Input: 8-bit, 3-band imagery.

Output: Feature class containing masks of various objects in the image.

Applicable geographies: The model is expected to work globally.

Model architecture: This model is based on the open-source Segment Anything Model (SAM) by Meta.

Training data: This model has been trained on the Segment Anything 1-Billion mask dataset (SA-1B), which comprises a diverse set of 11 million images and over 1 billion masks.

Sample results: Here are a few results from the model.
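As a rough illustration of the zero-shot, promptable segmentation described above (outside the ArcGIS workflow), the open-source segment-anything package can be prompted with a point to extract a mask; the checkpoint file, image path, and point coordinates below are placeholder assumptions:

# Minimal sketch of zero-shot mask extraction with Meta's segment-anything
# package. Checkpoint path, image path, and the point prompt are assumptions.
import numpy as np
import cv2
from segment_anything import sam_model_registry, SamPredictor

sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth")  # assumed local checkpoint
predictor = SamPredictor(sam)

image = cv2.cvtColor(cv2.imread("scene.jpg"), cv2.COLOR_BGR2RGB)  # assumed input image
predictor.set_image(image)

# A single foreground point prompt; SAM returns candidate masks with
# confidence scores and no class labels.
masks, scores, _ = predictor.predict(
    point_coords=np.array([[500, 375]]),
    point_labels=np.array([1]),
    multimask_output=True,
)
best_mask = masks[np.argmax(scores)]  # boolean HxW array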
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
This dataset was created by Jinttt
Released under Apache 2.0
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Segment Anything Model (SAM) with Prioritized Memory

Overview: The Segment Anything Model (SAM) by Meta is a state-of-the-art image segmentation model leveraging vision transformers. However, it suffers from high memory usage and computational inefficiencies. Our research introduces a prioritized memory mechanism to enhance SAM's performance while optimizing resource consumption.

Methodology: We propose a structured memory hierarchy to efficiently manage image embeddings and self-attention… See the full description on the dataset page: https://huggingface.co/datasets/vinit000/Enhancing-Segment-Anything-Model-with-Prioritized-Memory-For-Efficient-Image-Embeddings.
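The card only sketches the method; purely to illustrate the general idea of keeping higher-priority image embeddings resident, a hypothetical priority-based cache might look like the following (the class name, priority values, and capacity are assumptions, not the paper's actual memory hierarchy):

# Hypothetical, simplified priority cache for SAM image embeddings. This is an
# illustrative assumption, not the mechanism described in the paper.
import heapq
import itertools
import numpy as np

class PrioritizedEmbeddingCache:
    def __init__(self, capacity=32):
        self.capacity = capacity
        self.heap = []            # (priority, insertion_order, key), min-heap
        self.embeddings = {}      # key -> np.ndarray embedding
        self.counter = itertools.count()

    def put(self, key, embedding, priority):
        # Evict the lowest-priority embedding once the cache is full.
        if len(self.embeddings) >= self.capacity:
            _, _, evicted = heapq.heappop(self.heap)
            self.embeddings.pop(evicted, None)
        heapq.heappush(self.heap, (priority, next(self.counter), key))
        self.embeddings[key] = embedding

    def get(self, key):
        return self.embeddings.get(key)

cache = PrioritizedEmbeddingCache(capacity=4)
cache.put("img_001", np.zeros((256, 64, 64), dtype=np.float32), priority=0.9)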
https://creativecommons.org/publicdomain/zero/1.0/
The Segment Anything Model (SAM) and its successor, SAM2, are highly influential foundation models in the field of computer vision, specifically designed for promptable image segmentation.
In the context of the Kaggle competition, the use of SAM/SAM2 is for Candidate Generation:
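As a rough, assumed illustration of candidate generation (not the competition's actual pipeline), SAM's automatic mask generator produces class-agnostic candidate masks for an image; the checkpoint and image path are placeholders:

# Assumed sketch: class-agnostic candidate masks via SAM's automatic mask
# generator (segment-anything package). Paths are placeholders.
import cv2
from segment_anything import sam_model_registry, SamAutomaticMaskGenerator

sam = sam_model_registry["vit_b"](checkpoint="sam_vit_b_01ec64.pth")  # assumed checkpoint
mask_generator = SamAutomaticMaskGenerator(sam)

image = cv2.cvtColor(cv2.imread("frame.png"), cv2.COLOR_BGR2RGB)  # assumed input
candidates = mask_generator.generate(image)  # list of dicts: 'segmentation', 'bbox', 'area', ...
print(len(candidates), "candidate masks")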
The Segment Anything Model (SAM) has been proven to be a powerful foundation model for image segmentation, an important task in computer vision. However, the transfer of its rich semantic information to multiple downstream tasks remains unexplored. In this paper, we propose the Task-Aware Low-Rank Adaptation (TA-LoRA) method, which enables SAM to work as a foundation model for multi-task learning.
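TA-LoRA itself is not reproduced here; as a generic sketch of the low-rank adaptation idea it builds on, a frozen linear layer can be augmented with a trainable low-rank update (the module name, rank, and scaling are illustrative assumptions, not the paper's task-aware design):

# Generic LoRA-style adapter sketch (PyTorch). Plain low-rank adaptation only;
# the task-aware components of TA-LoRA are not shown.
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, base: nn.Linear, rank: int = 4, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():   # freeze the pretrained weights
            p.requires_grad = False
        self.lora_a = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.lora_b = nn.Parameter(torch.zeros(base.out_features, rank))
        self.scale = alpha / rank

    def forward(self, x):
        # Frozen base projection plus the trainable low-rank update B @ A.
        return self.base(x) + self.scale * (x @ self.lora_a.T @ self.lora_b.T)

layer = LoRALinear(nn.Linear(256, 256), rank=4)
out = layer(torch.randn(2, 256))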
This work investigates the robustness of SAM to corruptions and adversarial attacks.
Three publicly available medical imaging datasets: Breast Ultrasound Scan Images (BUSI), CVC-ClinicDB, and ISIC-2016.
https://creativecommons.org/publicdomain/zero/1.0/
This dataset was created by Mr. Ri and Ms. Tique
Released under CC0: Public Domain
https://researchdata.ntu.edu.sg/api/datasets/:persistentId/versions/2.0/customlicense?persistentId=doi:10.21979/N9/KF8798
We present EdgeSAM, an accelerated variant of the Segment Anything Model (SAM), optimized for efficient execution on edge devices with minimal compromise in performance. Our approach involves distilling the original ViT-based SAM image encoder into a purely CNN-based architecture, better suited for edge devices. We carefully benchmark various distillation strategies and demonstrate that task-agnostic encoder distillation fails to capture the full knowledge embodied in SAM. To overcome this bottleneck, we include both the prompt encoder and mask decoder in the distillation process, with box and point prompts in the loop, so that the distilled model can accurately capture the intricate dynamics between user input and mask generation. To mitigate dataset bias issues stemming from point prompt distillation, we incorporate a lightweight module within the encoder. As a result, EdgeSAM achieves a 37-fold speed increase compared to the original SAM, and it also outperforms MobileSAM/EfficientSAM, being over 7 times as fast when deployed on edge devices while enhancing the mIoUs on COCO and LVIS by 2.3/1.5 and 3.1/1.6, respectively. It is also the first SAM variant that can run at over 30 FPS on an iPhone 14.
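As a loose sketch of the encoder-distillation step (not EdgeSAM's full prompt-in-the-loop recipe, and with stand-in networks rather than the real SAM encoder), a CNN student can be trained to regress a frozen teacher's image embeddings:

# Assumed sketch of task-agnostic encoder distillation with stand-in networks.
# EdgeSAM's actual recipe additionally distills through the prompt encoder and
# mask decoder with box/point prompts in the loop.
import torch
import torch.nn as nn

teacher = nn.Conv2d(3, 256, kernel_size=16, stride=16)   # stand-in for SAM's ViT encoder
student = nn.Sequential(                                  # stand-in CNN encoder
    nn.Conv2d(3, 64, 3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(64, 256, 3, stride=8, padding=1),
)
for p in teacher.parameters():
    p.requires_grad = False

opt = torch.optim.AdamW(student.parameters(), lr=1e-4)
images = torch.randn(2, 3, 256, 256)  # dummy batch

with torch.no_grad():
    target = teacher(images)                              # frozen teacher embeddings
loss = nn.functional.mse_loss(student(images), target)    # embedding-matching loss
loss.backward()
opt.step()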
This paper tackles a novel problem: how to transfer knowledge from the emerging Segment Anything Model (SAM) to learn a compact panoramic semantic segmentation model, i.e., student, without requiring any labeled data.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Each folder in 'Prompting data.zip' corresponds to a single category (Bird, Cat, Bus, etc.), and each of these contains folders corresponding to a single participant (st1, st2, etc.). Each participant folder should contain 5 subfolders:
Quick usage:
- To get the best (highest-scoring) mask for a given image: masks[sorts[0]]
- To get the best set of prompts for that image: green[sorts[0]] and red[sorts[0]]
- To get which round produced the highest score for that image: eachround[sorts[0]]
The codebase associated with this work can be found at this GitHub repository.
Please refer to our lab-wide GitHub for more information regarding the code associated with our other papers.
The dataset used in this study for evaluating the performance of the Segment Anything Model (SAM) in clinical radiotherapy.
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Description: This dataset is a comprehensive collection of Brain Magnetic Resonance Imaging (MRI) scans, meticulously annotated with the Segment Anything Model (SAM). The data is stored in a CSV file format for easy access and manipulation.
Content: The dataset contains MRI scans of the brain, each of which is annotated with SAM. The annotations provide detailed information about the segmentation of various structures present in brain scans. The dataset is designed to aid in developing and validating algorithms for automatic brain structure segmentation.
segment-anything-2-main.zip is from https://github.com/facebookresearch/segment-anything-2
https://www.apache.org/licenses/LICENSE-2.0.html
This study introduces the concept of "structural beauty" as an objective computational approach for evaluating the aesthetic appeal of images. Through the utilization of the Segment Anything Model (SAM), we propose a method that leverages recursive segmentation to extract finer-grained substructures. Additionally, by reconstructing the hierarchical structure, we obtain a more accurate representation of substructure quantity and hierarchy. This approach reproduces and extends our previous research, allowing for the simultaneous assessment of Livingness in full-color images without the need for grayscale conversion or separate computations for foreground and background Livingness. Furthermore, the application of our method to the Scenic or Not dataset, a repository of subjective scenic ratings, demonstrates a high degree of consistency with subjective ratings in the 0-6 score range. This underscores that structural beauty is not solely a subjective perception, but a quantifiable attribute accessible through objective computation. Through our case studies, we have arrived at three significant conclusions. 1) Our method demonstrates the capability to accurately segment meaningful objects, including trees, buildings, and windows, as well as abstract substructures within paintings. 2) We observed that the clarity of an image impacts our computational results; clearer images tend to yield higher Livingness scores. However, for equally blurry images, Livingness does not exhibit a significant reduction, aligning with human visual perception. 3) Our approach fundamentally differs from methods employing Convolutional Neural Networks (CNNs) for predicting image scores. Our method not only provides computational results but also offers transparency and interpretability, positioning it as a novel avenue in the realm of Explainable AI (XAI).
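The recursive-segmentation idea can be illustrated loosely (this is not the authors' exact procedure) by re-running SAM's automatic mask generator inside the bounding box of each detected mask; the checkpoint, recursion depth, and minimum crop size below are assumptions:

# Loose illustration of recursive segmentation: segment the image, then
# re-segment the crop around each mask to find finer substructures.
import cv2
from segment_anything import sam_model_registry, SamAutomaticMaskGenerator

sam = sam_model_registry["vit_b"](checkpoint="sam_vit_b_01ec64.pth")  # assumed checkpoint
generator = SamAutomaticMaskGenerator(sam)

def recursive_segment(image, depth=2, min_side=64):
    """Return groups of masks found by recursively segmenting mask crops."""
    if depth == 0 or min(image.shape[:2]) < min_side:
        return []
    masks = generator.generate(image)
    levels = [masks]
    for m in masks:
        x, y, w, h = map(int, m["bbox"])      # XYWH bounding box of the mask
        crop = image[y:y + h, x:x + w]
        levels.extend(recursive_segment(crop, depth - 1, min_side))
    return levels

image = cv2.cvtColor(cv2.imread("scene.jpg"), cv2.COLOR_BGR2RGB)  # assumed input
hierarchy = recursive_segment(image)
print("substructure groups found:", len(hierarchy))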
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Evaluation of the Segment Anything Model (SAM) for penguin colony segmentation using mean intersection over union (mIoU), difference in perimeter to area ratio (PAR), area error, and accuracy (i.e. panels a-c in Figs 3 and 4 vs. ground truth). 95% confidence intervals are shown. An up (down) arrow indicates a measure where a larger (smaller) number is preferred.
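For reference, the mean intersection over union between predicted and ground-truth binary masks can be computed along these lines (a generic sketch, not the paper's evaluation code):

# Generic mIoU sketch for binary masks; not the paper's evaluation code.
import numpy as np

def iou(pred, gt):
    pred, gt = pred.astype(bool), gt.astype(bool)
    union = np.logical_or(pred, gt).sum()
    return np.logical_and(pred, gt).sum() / union if union else 1.0

def mean_iou(preds, gts):
    return float(np.mean([iou(p, g) for p, g in zip(preds, gts)]))

preds = [np.random.rand(64, 64) > 0.5 for _ in range(3)]  # dummy predictions
gts = [np.random.rand(64, 64) > 0.5 for _ in range(3)]    # dummy ground truth
print(mean_iou(preds, gts))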
Advanced by transformer architecture, vision foundation models (VFMs) achieve remarkable progress in performance and generalization ability. Segment Anything Model (SAM) is one remarkable model that can achieve generalized segmentation. However, most VFMs cannot run in real-time, which makes it difficult to transfer them into several products.
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Auto-annotate images with GroundingDINO and SAM models Auto-annotate images using a text prompt. GroundingDINO is employed for object detection (bounding boxes), followed by MobileSAM or SAM for segmentation. The annotations are then saved in both Pascal VOC format and COCO format....
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Dataset Title: Bugzz lightyears: To Semantic Segmentation and Bug-yond!
This dataset comprises a collection of real and robotic toy bugs designed for a small-scale semantic segmentation project. Each bug has been captured six times from various angles, ensuring comprehensive coverage of their features and details. The dataset serves as a valuable resource for exploring semantic segmentation techniques and evaluating machine learning models.