Pose Estimation
Pose estimation detects the position and orientation of a person's body (or an object's structure) by locating a set of predefined keypoints in an image or video frame. For human pose estimation, this typically means locating 17 to 33 body keypoints, depending on the convention (the COCO format uses 17; MediaPipe Pose uses 33), such as the nose, eyes, shoulders, elbows, wrists, hips, knees, and ankles, and connecting them to form a skeleton representation.
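As a concrete sketch of what a skeleton representation looks like, the snippet below encodes the 17-keypoint COCO convention: keypoint names, skeleton edges as index pairs (used to draw limbs), and a pose as one (x, y, confidence) triple per keypoint. The `valid_pose` helper is an illustrative addition, not part of any particular library.

```python
# The 17-keypoint layout from the COCO dataset, a common convention
# for human pose estimation outputs.
COCO_KEYPOINTS = [
    "nose", "left_eye", "right_eye", "left_ear", "right_ear",
    "left_shoulder", "right_shoulder", "left_elbow", "right_elbow",
    "left_wrist", "right_wrist", "left_hip", "right_hip",
    "left_knee", "right_knee", "left_ankle", "right_ankle",
]

# Skeleton edges as index pairs into COCO_KEYPOINTS, used to connect
# keypoints into limbs when drawing the skeleton.
COCO_SKELETON = [
    (5, 7), (7, 9),      # left arm: shoulder -> elbow -> wrist
    (6, 8), (8, 10),     # right arm
    (11, 13), (13, 15),  # left leg: hip -> knee -> ankle
    (12, 14), (14, 16),  # right leg
    (5, 6), (11, 12),    # shoulder line, hip line
    (5, 11), (6, 12),    # torso sides
]

# A single pose is then just (x, y, confidence) per keypoint.
def visible_keypoints(pose, min_conf=0.3):
    """Count keypoints detected above a confidence threshold.
    (Hypothetical helper for illustration.)"""
    return sum(1 for (_x, _y, c) in pose if c >= min_conf)
```

Models that output 33 landmarks (e.g. MediaPipe Pose) follow the same idea with a larger name list and edge set.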
Two main approaches exist. Top-down methods first detect each person with a bounding box, then run a keypoint estimator on each crop (e.g. HRNet, ViTPose, RTMPose). This tends to be more accurate but slower, because inference cost grows linearly with the number of people. Bottom-up methods detect all keypoints in the image in a single pass, then group them into individual skeletons (e.g. OpenPose, HigherHRNet). This scales better to crowded scenes because the network runs once per image, regardless of how many people it contains.
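The structural difference between the two pipelines can be sketched as follows. Here `detect_people`, `estimate_keypoints_in_crop`, `detect_all_keypoints`, and `group_keypoints` are hypothetical stand-ins for real detector and estimator models, passed in as callables so the control flow is the only thing shown:

```python
def top_down(image, detect_people, estimate_keypoints_in_crop):
    """Top-down: run the keypoint estimator once per detected person.
    Cost grows linearly with the number of people in the image."""
    poses = []
    for box in detect_people(image):
        crop = image  # real code would crop `image` to `box` first
        poses.append(estimate_keypoints_in_crop(crop, box))
    return poses

def bottom_up(image, detect_all_keypoints, group_keypoints):
    """Bottom-up: find every keypoint in one forward pass, then group
    them into skeletons. Cost is roughly constant in person count."""
    keypoints = detect_all_keypoints(image)
    return group_keypoints(keypoints)
```

The grouping step in real bottom-up systems (e.g. part affinity fields in OpenPose, associative embeddings in HigherHRNet) is where most of the algorithmic complexity lives; the sketch hides it behind `group_keypoints`.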
Applications include fitness and sports analytics (tracking form, counting reps, analyzing biomechanics), healthcare and physical therapy (monitoring patient movement and recovery), animation and motion capture (driving 3D character rigs from video), action recognition (classifying activities based on pose sequences), and safety monitoring (detecting falls or unsafe postures in industrial environments). Real-time pose estimation on mobile devices is enabled by lightweight models like MoveNet and MediaPipe Pose.
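To make the fitness use case concrete, a sketch of rep counting from pose output: compute a joint angle from three keypoints (e.g. shoulder, elbow, wrist), then count repetitions with simple hysteresis over the angle sequence. The function names and the 90°/160° thresholds are illustrative assumptions, not values from any specific system:

```python
import math

def joint_angle(a, b, c):
    """Angle in degrees at keypoint b, formed by keypoints a-b-c.
    e.g. elbow angle from (shoulder, elbow, wrist) (x, y) positions."""
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    norm = math.hypot(*v1) * math.hypot(*v2)
    cos = max(-1.0, min(1.0, dot / norm))  # clamp for float safety
    return math.degrees(math.acos(cos))

def count_reps(elbow_angles, down=90.0, up=160.0):
    """Count push-up-style reps from a per-frame elbow-angle sequence.
    Hysteresis: a rep completes on each flexed -> extended transition,
    so jitter around a single threshold is not double-counted."""
    reps, flexed = 0, False
    for angle in elbow_angles:
        if angle < down:
            flexed = True
        elif flexed and angle > up:
            reps += 1
            flexed = False
    return reps
```

For example, `joint_angle((0, 0), (1, 0), (1, 1))` is 90 degrees, and a sequence that dips below 90° and recovers above 160° twice yields `count_reps(...) == 2`.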

