Mask R-CNN

Mask R-CNN is a two-stage instance segmentation architecture that extends Faster R-CNN with a mask prediction branch that runs in parallel with the existing classification and bounding-box regression heads. Published by He et al. (Facebook AI Research) in 2017, it splits instance segmentation into detection (which objects, where) and mask prediction (which pixels belong to each detected object). The two tasks are decoupled: the mask branch predicts a binary mask for every class independently, and the classification head decides which of those masks is used.

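The pipeline works in two stages. First, a Region Proposal Network (RPN) generates candidate object regions from backbone features (typically a ResNet with a Feature Pyramid Network, FPN). Second, for each proposal, three parallel heads predict the class label, refined bounding box coordinates, and a binary pixel mask within the box. A key contribution was RoIAlign, which replaces RoIPool. RoIPool rounds each proposal's boundaries and bins to whole feature-map cells, which shifts the extracted features relative to the proposal. RoIAlign keeps the coordinates as floating-point values and reads the feature map at those locations with bilinear interpolation, preserving spatial alignment between feature maps and proposals. In the paper's ablations on COCO, this change improved mask AP by about 3 points, with most of the gain at stricter IoU thresholds.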
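The difference between the two pooling operators can be sketched in a few lines of pure Python. This is an illustrative sketch, not the paper's or any library's implementation: the function names (`bilinear_sample`, `roi_align`, `roi_pool_quantized`) are hypothetical, boxes are assumed to be given as `(y1, x1, y2, x2)` in feature-map coordinates, `feature[i][j]` is assumed to sit at continuous coordinate `(i, j)`, and each RoIAlign bin averages a 2×2 grid of samples.

```python
import math

def bilinear_sample(feature, y, x):
    """Read a 2D feature map (list of rows) at a fractional (y, x) location."""
    h, w = len(feature), len(feature[0])
    # Clamp so the four neighbouring cells always exist.
    y = min(max(y, 0.0), h - 1.0)
    x = min(max(x, 0.0), w - 1.0)
    y0, x0 = int(math.floor(y)), int(math.floor(x))
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    dy, dx = y - y0, x - x0
    top = feature[y0][x0] * (1 - dx) + feature[y0][x1] * dx
    bottom = feature[y1][x0] * (1 - dx) + feature[y1][x1] * dx
    return top * (1 - dy) + bottom * dy

def roi_align(feature, box, out_size=2, samples=2):
    """RoIAlign-style pooling: no rounding, average of bilinear samples per bin."""
    y1, x1, y2, x2 = box
    bin_h = (y2 - y1) / out_size
    bin_w = (x2 - x1) / out_size
    out = []
    for i in range(out_size):
        row = []
        for j in range(out_size):
            total = 0.0
            # Regularly spaced sample points inside bin (i, j).
            for sy in range(samples):
                for sx in range(samples):
                    yy = y1 + (i + (sy + 0.5) / samples) * bin_h
                    xx = x1 + (j + (sx + 0.5) / samples) * bin_w
                    total += bilinear_sample(feature, yy, xx)
            row.append(total / (samples * samples))
        out.append(row)
    return out

def roi_pool_quantized(feature, box, out_size=2):
    """RoIPool-style pooling: round the box, use integer bin edges, max-pool."""
    h, w = len(feature), len(feature[0])
    # Quantization 1: snap box corners to whole cells (round half up).
    y1, x1, y2, x2 = (int(math.floor(v + 0.5)) for v in box)
    roi_h = max(y2 - y1 + 1, 1)
    roi_w = max(x2 - x1 + 1, 1)
    out = []
    for i in range(out_size):
        # Quantization 2: bin edges are floored / ceiled to whole cells.
        ys = min(max(y1 + math.floor(i * roi_h / out_size), 0), h)
        ye = min(max(y1 + math.ceil((i + 1) * roi_h / out_size), 0), h)
        row = []
        for j in range(out_size):
            xs = min(max(x1 + math.floor(j * roi_w / out_size), 0), w)
            xe = min(max(x1 + math.ceil((j + 1) * roi_w / out_size), 0), w)
            cells = [feature[y][x] for y in range(ys, ye) for x in range(xs, xe)]
            row.append(max(cells) if cells else 0.0)
        out.append(row)
    return out

# Toy feature map whose value equals its column index (a horizontal ramp).
feature = [[float(x) for x in range(6)] for _ in range(4)]

# Two proposals that differ by a 0.8-cell horizontal shift.
box_a = (0.0, 0.6, 2.0, 2.6)
box_b = (0.0, 1.4, 2.0, 3.4)

print("RoIPool :", roi_pool_quantized(feature, box_a), roi_pool_quantized(feature, box_b))
print("RoIAlign:", roi_align(feature, box_a), roi_align(feature, box_b))
```

On this toy ramp, both proposals round to the same integer box, so the quantized pooling cannot tell them apart, while the RoIAlign-style output moves with the proposal. Mask prediction is sensitive to exactly this kind of sub-cell offset, because one feature-map cell covers many input pixels.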

Mask R-CNN became a standard baseline for instance segmentation research and remains widely used in production systems. Its modular design makes it easy to swap backbones, add a keypoint head (for pose estimation), or attach a panoptic segmentation branch. While newer architectures like Mask2Former and SAM have pushed accuracy higher, Mask R-CNN's straightforward training and well-understood behavior make it a reliable choice for custom datasets. Datature Nexus supports Mask R-CNN for instance segmentation training.
