A loss function measures how far a model's predictions are from the correct answers, producing a single number that training tries to minimize. It's the objective that guides gradient descent: the optimizer computes how changing each weight would affect the loss, then nudges weights to make the loss smaller. Different tasks need different loss functions because they define "correct" in different ways.
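The loop above can be made concrete with a minimal sketch: a one-weight linear model trained by gradient descent on a squared-error loss. The data, learning rate, and step count are all illustrative choices, not from the original text.

```python
import numpy as np

# Toy data generated by y = 2 * x; the model must recover w = 2.
x = np.array([1.0, 2.0, 3.0, 4.0])
y_true = np.array([2.0, 4.0, 6.0, 8.0])

w = 0.0    # initial weight
lr = 0.02  # learning rate (hypothetical value)

for _ in range(200):
    y_pred = w * x
    # The loss: a single number summarizing how far predictions are from targets.
    loss = np.mean((y_pred - y_true) ** 2)
    # How changing the weight would affect the loss:
    # d/dw mean((w*x - y)^2) = 2 * mean((w*x - y) * x)
    grad = 2 * np.mean((y_pred - y_true) * x)
    # Nudge the weight in the direction that makes the loss smaller.
    w -= lr * grad
```

After training, `w` has converged to roughly 2.0 and the loss is near zero; every optimizer in a deep learning framework is an elaboration of this compute-gradient-then-nudge loop.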
For classification, cross-entropy loss compares the model's predicted probability distribution against the true class label. For bounding box regression, IoU-based losses (GIoU, DIoU, CIoU) measure how well the predicted box overlaps with the ground truth. For segmentation, a combination of cross-entropy (per-pixel classification) and Dice loss (overlap-based) is standard. Focal loss modifies cross-entropy by down-weighting easy examples and focusing gradient signal on hard ones, which is critical for handling the massive foreground/background imbalance in object detection.
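Two of these losses fit in a few lines each. Below is a sketch of binary focal loss (the standard formulation with weighting factor `alpha` and focusing parameter `gamma`) and a plain IoU loss for axis-aligned boxes; the function names and default values are illustrative, not from any particular library.

```python
import numpy as np

def binary_focal_loss(p, y, alpha=0.25, gamma=2.0):
    """Focal loss for one prediction: p = predicted foreground probability, y = 0/1 label."""
    p_t = p * y + (1 - p) * (1 - y)              # probability assigned to the true class
    alpha_t = alpha * y + (1 - alpha) * (1 - y)  # class-balance weight
    # (1 - p_t)^gamma shrinks the loss of well-classified (high p_t) examples.
    return -alpha_t * (1 - p_t) ** gamma * np.log(p_t)

def iou_loss(box_a, box_b):
    """1 - IoU for two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return 1.0 - inter / (area_a + area_b - inter)

# An easy, confidently-correct prediction contributes almost nothing,
# while a badly-classified one dominates the gradient signal.
easy = binary_focal_loss(0.9, 1)
hard = binary_focal_loss(0.1, 1)
```

With `gamma=2`, the easy example's loss is suppressed by a factor of `(1 - 0.9)^2 = 0.01` relative to plain cross-entropy, while the hard example keeps most of its loss; this is exactly the down-weighting described above. GIoU/DIoU/CIoU extend `iou_loss` with extra penalty terms (enclosing-box area, center distance, aspect ratio) so the loss remains informative even when boxes don't overlap.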
Most modern detectors use a multi-task loss that combines classification loss, box regression loss, and sometimes objectness or centerness loss with learned or fixed weighting between terms. The balance between loss components affects what the model prioritizes: overweighting localization loss produces tighter boxes but may hurt classification accuracy, and vice versa. Loss curve monitoring (plotting training and validation loss over epochs) is the primary diagnostic tool for catching training problems.

