Ground truth is the set of correct, human-verified annotations that a machine learning model is trained and evaluated against. In object detection, ground truth consists of bounding boxes with class labels drawn around every object of interest in each image. In segmentation, it's pixel-level masks. In classification, it's the correct class tag. The model's job is to produce predictions that match the ground truth as closely as possible.
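The three annotation shapes described above can be sketched as simple Python records (the field names here are illustrative, not from any particular dataset format):

```python
# Object detection ground truth: one bounding box [x_min, y_min, x_max, y_max]
# and one class label per object of interest in the image.
detection_gt = {
    "image_id": 17,
    "boxes": [[34, 50, 120, 210], [200, 40, 310, 180]],
    "labels": ["person", "bicycle"],
}

# Classification ground truth: a single correct class tag per image.
classification_gt = {"image_id": 17, "label": "street_scene"}

# Segmentation ground truth: a pixel-level mask with the same height and
# width as the image, where each entry is a class index (0 = background).
segmentation_gt = {
    "image_id": 17,
    "mask": [[0, 0, 1],
             [0, 1, 1],
             [2, 2, 0]],  # tiny 3x3 image for illustration
}

# Every box needs a label, and vice versa.
assert len(detection_gt["boxes"]) == len(detection_gt["labels"])
```

Real formats (COCO JSON, Pascal VOC XML, etc.) add more fields, but all of them reduce to variations on these three structures.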
Ground truth quality sets the ceiling for model performance. If the bounding boxes are sloppy (not tight around objects), the model learns sloppy localization. If annotators disagree on ambiguous cases (is that shadow a crack or not?), the model gets conflicting training signals. If classes are mislabeled, the model learns wrong associations. This is why annotation guidelines, quality control workflows, and inter-annotator agreement checks matter as much as the model architecture itself.
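One common inter-annotator agreement check for class labels is Cohen's kappa, which corrects raw agreement for the agreement two annotators would reach by chance. A minimal sketch (the crack/shadow labels below are hypothetical data echoing the example above):

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labeling the same items.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement
    rate and p_e is the agreement expected by chance, computed from each
    annotator's marginal label frequencies.
    """
    assert len(labels_a) == len(labels_b)
    n = len(labels_a)
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    p_e = sum((freq_a[c] / n) * (freq_b[c] / n)
              for c in set(labels_a) | set(labels_b))
    if p_e == 1.0:  # degenerate case: both annotators used one label only
        return 1.0
    return (p_o - p_e) / (1 - p_e)

# Two annotators labeling the same six ambiguous image regions:
a = ["crack", "crack", "shadow", "crack", "shadow", "shadow"]
b = ["crack", "shadow", "shadow", "crack", "shadow", "shadow"]
print(round(cohens_kappa(a, b), 3))  # → 0.667
```

A kappa well below 1.0 on a labeling task is a signal that the annotation guidelines need tightening before more data is labeled, since the disagreement will otherwise flow straight into the training signal.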
In evaluation, ground truth serves as the reference for computing all metrics: IoU between predicted and ground truth boxes determines true/false positives, which flow into precision, recall, mAP, and F1 calculations. Some datasets have known ground truth errors (COCO has approximately 1-2% label noise), so perfect scores are neither expected nor necessarily desirable.
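The IoU-based matching step can be sketched as follows: compute IoU between each prediction and the ground-truth boxes, then greedily assign predictions (highest confidence first) to unused ground truths. This is a simplified single-class version of the matching used in mAP evaluation, not any specific benchmark's exact protocol:

```python
def iou(box_a, box_b):
    """Intersection over Union for [x_min, y_min, x_max, y_max] boxes."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

def match_predictions(preds, gts, iou_threshold=0.5):
    """Greedily match predictions (sorted by descending confidence) to
    unused ground-truth boxes; a match at or above the IoU threshold is a
    true positive, anything else is a false positive."""
    results, used = [], set()
    for box, score in sorted(preds, key=lambda p: -p[1]):
        best_iou, best_gt = 0.0, None
        for i, gt in enumerate(gts):
            if i in used:
                continue
            overlap = iou(box, gt)
            if overlap > best_iou:
                best_iou, best_gt = overlap, i
        if best_gt is not None and best_iou >= iou_threshold:
            used.add(best_gt)
            results.append(("TP", score))
        else:
            results.append(("FP", score))
    return results  # any ground truth left unmatched is a false negative

gts = [[0, 0, 10, 10], [20, 20, 30, 30]]
preds = [([1, 1, 10, 10], 0.9), ([50, 50, 60, 60], 0.8)]
print(match_predictions(preds, gts))  # → [('TP', 0.9), ('FP', 0.8)]
```

The resulting TP/FP labels, ordered by confidence, are exactly what precision-recall curves (and hence mAP and F1) are computed from; the unmatched second ground-truth box counts as a false negative against recall.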

