Annotation Format

An annotation format defines how labeled data is stored and structured in files. When you draw a bounding box around an object in an image, the box coordinates, class label, and image reference need to be saved in a specific format that your training framework can read. Different formats encode this information differently, and converting between them is a routine part of any computer vision workflow.

The most common formats are COCO (a single JSON file with image metadata, category definitions, and annotations, including polygons and bounding boxes stored as top-left x, top-left y, width, height in absolute pixels), YOLO (one text file per image, with one line per object giving a class index followed by center-x, center-y, width, height, each normalized by the image dimensions), Pascal VOC (one XML file per image with absolute pixel corner coordinates xmin, ymin, xmax, ymax), and CreateML (Apple's JSON format for on-device models). Each format has trade-offs: COCO supports complex annotations including segmentation masks and keypoints, YOLO is minimal and fast to parse, and VOC is verbose but human-readable.
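To make the coordinate conventions concrete, here is a minimal Python sketch that expresses the same bounding box in COCO, YOLO, and Pascal VOC terms. The function names are hypothetical, not part of any library, and the sketch assumes plain zero-based pixel coordinates; it ignores any off-by-one indexing convention a particular dataset may use.

```python
def coco_to_yolo(bbox, img_w, img_h):
    """COCO [x_min, y_min, width, height] in pixels -> YOLO (cx, cy, w, h), normalized."""
    x, y, w, h = bbox
    return ((x + w / 2) / img_w, (y + h / 2) / img_h, w / img_w, h / img_h)


def coco_to_voc(bbox):
    """COCO [x_min, y_min, width, height] -> VOC (xmin, ymin, xmax, ymax) in pixels."""
    x, y, w, h = bbox
    return (x, y, x + w, y + h)


def yolo_to_voc(box, img_w, img_h):
    """YOLO normalized (cx, cy, w, h) -> VOC (xmin, ymin, xmax, ymax) in pixels."""
    cx, cy, w, h = box
    return (
        (cx - w / 2) * img_w,
        (cy - h / 2) * img_h,
        (cx + w / 2) * img_w,
        (cy + h / 2) * img_h,
    )


# One box in a 400x200 image, written in COCO style.
coco_box = [100, 50, 200, 100]
yolo_box = coco_to_yolo(coco_box, img_w=400, img_h=200)
voc_box = coco_to_voc(coco_box)
```

Note that YOLO is the only one of the three that needs the image size to interpret a box, which is why converters to or from YOLO must read the image dimensions as well as the label file.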

Most annotation tools can export in multiple formats, and conversion scripts between formats are widely available. When starting a project, choose the format that matches your training framework (Ultralytics expects YOLO format, Detectron2 has built-in support for COCO, etc.) to avoid conversion errors that can silently corrupt your labels. For example, reading COCO's top-left x and y as YOLO's box center shifts every box by half its size without raising any error.
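Because conversion mistakes rarely raise errors, a quick sanity check on converted labels can catch them early. The following is a rough sketch of such a check for YOLO detection labels. The function name `check_yolo_line` and its messages are hypothetical, and it assumes the five-field detection layout (class index plus four normalized values), not segmentation or keypoint variants.

```python
def check_yolo_line(line, num_classes, eps=1e-6):
    """Return a list of problems found in one YOLO label line (empty if none)."""
    parts = line.split()
    if len(parts) != 5:
        return [f"expected 5 fields, got {len(parts)}"]
    try:
        class_id = int(parts[0])
        cx, cy, w, h = (float(p) for p in parts[1:])
    except ValueError:
        return ["could not parse fields as numbers"]

    problems = []
    if not 0 <= class_id < num_classes:
        problems.append(f"class id {class_id} outside 0..{num_classes - 1}")

    if not all(0.0 <= v <= 1.0 for v in (cx, cy, w, h)):
        # Typical symptom of pixel coordinates written without normalization.
        problems.append("coordinates not normalized to [0, 1]")
    elif w <= 0 or h <= 0:
        problems.append("box has zero width or height")
    elif (cx - w / 2 < -eps or cx + w / 2 > 1 + eps
          or cy - h / 2 < -eps or cy + h / 2 > 1 + eps):
        # Suspicious: may indicate a top-left corner stored as the box center.
        problems.append("box extends past the image edge")
    return problems


good = check_yolo_line("0 0.5 0.5 0.5 0.5", num_classes=3)
unnormalized = check_yolo_line("0 100 50 200 100", num_classes=3)
shifted = check_yolo_line("0 0.1 0.1 0.5 0.5", num_classes=3)
```

A check like this cannot prove the labels are right, so it is still worth drawing a few converted boxes on their images and inspecting them by eye.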
