Feature Pyramid Network (FPN)

A Feature Pyramid Network (FPN) is a multi-scale feature extraction architecture that adds a top-down pathway with lateral connections to a backbone, producing semantically strong feature maps at every resolution level. The problem it solves: deep neural networks naturally produce features at multiple scales as they downsample through their layers, but only the deepest features (lowest spatial resolution) carry strong semantic meaning, while shallow features (high resolution) retain spatial detail but lack semantic context. FPN combines both.

The architecture works in two passes. The bottom-up pass is the normal forward pass through a backbone such as ResNet, which produces feature maps at progressively lower resolutions (1/4, 1/8, 1/16, and 1/32 of the input size). The top-down pass starts from the deepest, most semantic feature map and repeatedly upsamples it, merging the result at each level with the corresponding bottom-up feature map, which first passes through a lateral 1x1 convolution so that the channel counts match. In the original FPN design the merge is an element-wise addition. The result is a pyramid of feature maps that all carry strong semantics but at different spatial resolutions.
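The two passes can be illustrated with a minimal NumPy sketch. It is not a reference implementation: the bottom-up maps are random stand-ins for backbone outputs, the channel widths are assumed ResNet-50-like values, the lateral weights are random rather than learned, and the merge is assumed to be element-wise addition after 2x nearest-neighbour upsampling. The helper names (`lateral_1x1`, `upsample_2x`, `fpn_top_down`) are hypothetical, and the 3x3 smoothing convolution that the original FPN applies to each merged map is omitted for brevity.

```python
import numpy as np


def lateral_1x1(feature, weight):
    """1x1 convolution: a per-pixel linear map over channels.

    feature: (C_in, H, W), weight: (C_out, C_in) -> (C_out, H, W)
    """
    return np.einsum("oc,chw->ohw", weight, feature)


def upsample_2x(feature):
    """Nearest-neighbour upsampling by 2 along height and width."""
    return feature.repeat(2, axis=1).repeat(2, axis=2)


def fpn_top_down(bottom_up, lateral_weights):
    """Merge bottom-up maps (ordered shallow to deep) into pyramid levels."""
    # The deepest map has no coarser level above it, so it is only projected.
    merged = lateral_1x1(bottom_up[-1], lateral_weights[-1])
    pyramid = [merged]
    # Walk from the second-deepest level towards the shallow, high-resolution one.
    for feature, weight in zip(bottom_up[-2::-1], lateral_weights[-2::-1]):
        merged = lateral_1x1(feature, weight) + upsample_2x(merged)
        pyramid.append(merged)
    return pyramid[::-1]  # back to shallow-to-deep order


rng = np.random.default_rng(0)
input_size = 128                      # assumed square input, for illustration
strides = [4, 8, 16, 32]              # 1/4, 1/8, 1/16, 1/32 of the input size
in_channels = [256, 512, 1024, 2048]  # assumed ResNet-50-like widths
out_channels = 64                     # kept small for illustration

# Stand-ins for the bottom-up pass: one random map per backbone stage.
bottom_up = [
    rng.standard_normal((c, input_size // s, input_size // s))
    for c, s in zip(in_channels, strides)
]
# Random (untrained) lateral weights, one 1x1 convolution per level.
lateral_weights = [
    rng.standard_normal((out_channels, c)) * 0.01 for c in in_channels
]

pyramid = fpn_top_down(bottom_up, lateral_weights)
for stride, level in zip(strides, pyramid):
    print(f"stride {stride}: {level.shape}")
```

Every output level ends up with the same channel count while keeping the spatial size of its bottom-up counterpart, which is what lets a single detection head run over all levels.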

FPN and its variants are the default neck in most modern detectors. Several YOLO versions (YOLOv4, for example) use a PANet-style neck, which adds a second, bottom-up path on top of FPN; EfficientDet uses BiFPN, a weighted bidirectional variant; and even transformer-based detectors often use FPN-style multi-scale features. Without multi-scale features, detectors struggle with objects at extreme sizes, especially small ones.
