The Segment Anything Model (SAM) is a foundation model for image segmentation developed by Meta AI, trained on the SA-1B dataset of 11 million images and over 1 billion masks. Released in April 2023, SAM introduced promptable segmentation: given an image and a prompt (a point click, a bounding box, a rough mask, or, in the paper's exploratory experiments, a text description), it produces a high-quality segmentation mask in real time, zero-shot, without task-specific training.
The architecture has three parts. A heavyweight image encoder (a ViT) processes the image once to produce an image embedding. A lightweight prompt encoder converts the user's click, box, or coarse mask into prompt embeddings. A fast, transformer-based mask decoder (about 50 ms per prompt) combines the two to predict the final mask. Because the expensive image encoding runs only once per image, users can query multiple objects interactively with near-instant response.
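The encode-once, query-many design can be illustrated with a deliberately simplified sketch. These are toy stand-in functions, not Meta's implementation: the "encoder" just normalizes pixels and the "decoder" thresholds feature similarity to the clicked pixel, purely to show where the expensive and cheap steps sit.

```python
import numpy as np

def image_encoder(image: np.ndarray) -> np.ndarray:
    """Stand-in for the heavyweight ViT image encoder: one expensive pass.
    Here it merely normalizes pixels to [0, 1]."""
    return image.astype(np.float32) / 255.0

def prompt_encoder(point: tuple) -> tuple:
    """Stand-in for the lightweight prompt encoder (a single point click)."""
    return point

def mask_decoder(features: np.ndarray, point: tuple) -> np.ndarray:
    """Stand-in for the fast mask decoder: selects pixels whose feature
    value is close to that of the clicked pixel."""
    y, x = point
    seed = features[y, x]
    return (np.abs(features - seed) < 0.1).astype(np.uint8)

# One image, encoded once; many interactive prompts, each decoded cheaply.
image = np.zeros((8, 8), dtype=np.uint8)
image[2:6, 2:6] = 200                    # a bright "object" on dark background
features = image_encoder(image)          # expensive step, runs exactly once

mask_a = mask_decoder(features, prompt_encoder((3, 3)))  # click the object
mask_b = mask_decoder(features, prompt_encoder((0, 0)))  # click the background
```

In the real model, the same split is what makes interactive annotation feel instant: the ViT encoding dominates latency, while each new prompt costs only the lightweight encoder plus the ~50 ms decoder.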
SAM 2 (2024) extended the model to video, adding memory-based temporal propagation so you can click on an object in one frame and track its mask across the entire video. SAM2Long improved long-video performance through memory tree search. SAM 3 (2025) advanced boundary quality and multi-granularity outputs. SAM's impact has been substantial: it enables zero-shot segmentation across domains (medical, satellite, industrial) and powers interactive annotation tools, including Datature Nexus's smart segmentation feature for one-click mask generation.

