Classification
Classification is the task of assigning an input to one of several predefined categories. In image classification, the model looks at an entire image and outputs a single label (or a ranked list of labels with confidence scores). Is this a photo of a cat or a dog? Is this chest X-ray normal or abnormal? Is this product defective or acceptable? These are all classification problems.
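
To make the "ranked list of labels with confidence scores" concrete, here is a minimal sketch, assuming the model emits one raw score (logit) per class and that a softmax turns those scores into confidences. The label names, logit values, and the helper `rank_labels` are hypothetical and chosen only for illustration.

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def rank_labels(logits, labels, k=3):
    """Return the k most likely (label, confidence) pairs, best first."""
    probs = softmax(logits)
    ranked = sorted(zip(labels, probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]

# Hypothetical model output for a single image (illustrative values only).
labels = ["cat", "dog", "rabbit"]
logits = [2.0, 0.5, -1.0]

top = rank_labels(logits, labels, k=2)
for label, confidence in top:
    print(f"{label}: {confidence:.3f}")
```

Taking only the first entry of the ranked list gives the single-label prediction; keeping the top few is what top-k evaluation relies on.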
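Binary classification distinguishes between two classes (defective vs. good, positive vs. negative). Multi-class classification picks exactly one class from many options (classifying an image as one of 1,000 ImageNet categories). Multi-label classification assigns any number of tags to a single image (a photo might be tagged "outdoor", "sunset", and "beach" at once). Each variant typically pairs a different output layer with a different loss function: a single sigmoid output with binary cross-entropy for binary tasks, a softmax over all classes with categorical cross-entropy for multi-class tasks, and an independent sigmoid per label with binary cross-entropy for multi-label tasks.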
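
The difference between the three variants is easiest to see in how the loss is computed. The sketch below uses only the standard library and assumes the common pairings: sigmoid with binary cross-entropy, softmax with categorical cross-entropy, and per-label sigmoids for multi-label. The function names and the clipping constant `EPS` are illustrative choices, not a particular framework's API.

```python
import math

EPS = 1e-12  # assumed clipping constant to keep log() away from zero

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def binary_cross_entropy(logit, target):
    """Binary case: one output logit, target is 0 or 1."""
    p = min(max(sigmoid(logit), EPS), 1.0 - EPS)
    return -(target * math.log(p) + (1 - target) * math.log(1.0 - p))

def categorical_cross_entropy(logits, target_index):
    """Multi-class case: one logit per class, exactly one correct class."""
    p = max(softmax(logits)[target_index], EPS)
    return -math.log(p)

def multilabel_loss(logits, targets):
    """Multi-label case: an independent binary decision per tag, averaged."""
    losses = [binary_cross_entropy(z, t) for z, t in zip(logits, targets)]
    return sum(losses) / len(losses)

# Illustrative values only.
print(binary_cross_entropy(1.2, 1))                    # defective vs. good
print(categorical_cross_entropy([2.0, 0.5, -1.0], 0))  # one of three classes
print(multilabel_loss([1.5, 0.8, -2.0], [1, 1, 0]))    # outdoor, sunset, beach
```

The multi-label loss scores each tag independently, so several tags can be confident at once; the softmax in the multi-class loss forces the classes to compete for a fixed probability mass.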
Modern image classifiers use convolutional neural networks (ResNet, EfficientNet) or vision transformers (ViT, DeiT) as backbones, typically pre-trained on ImageNet and fine-tuned on domain-specific data. Classification on clean benchmarks is largely solved (the strongest models exceed 90% top-1 accuracy on ImageNet), but real-world challenges remain: handling distribution shift, detecting out-of-distribution inputs, and maintaining accuracy on rare classes with few training examples.
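
The rare-class problem is easy to miss when only a single accuracy number is reported. The sketch below, built on made-up labels for a hypothetical defect-inspection task, compares overall accuracy with per-class accuracy; the data and function names are illustrative assumptions rather than results from any real system.

```python
from collections import Counter

def overall_accuracy(y_true, y_pred):
    """Fraction of all examples predicted correctly."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

def per_class_accuracy(y_true, y_pred):
    """Fraction of each true class predicted correctly (per-class recall)."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / totals[label] for label in totals}

# Made-up data: 95 good parts, 5 defective ones.
# The hypothetical classifier catches only 1 of the 5 defects.
y_true = ["good"] * 95 + ["defective"] * 5
y_pred = ["good"] * 95 + ["good"] * 4 + ["defective"] * 1

overall = overall_accuracy(y_true, y_pred)
by_class = per_class_accuracy(y_true, y_pred)

print(f"overall: {overall:.2f}")
for label, acc in by_class.items():
    print(f"{label}: {acc:.2f}")
```

In this toy setup the overall figure looks strong while the rare class is mostly missed, which is why per-class metrics are worth reporting alongside top-1 accuracy when classes are imbalanced.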
