paper: 2018: https://arxiv.org/abs/1805.05563
2019 RetinaFace: Single-stage Dense Face Localisation in the Wild.
2017 RetinaNet:
https://towardsdatascience.com/review-retinanet-focal-loss-object-detection-38fba6afabe4
1.3. Number of Boxes Comparison
- YOLOv1: 98 boxes
- YOLOv2: ~1k
- OverFeat: ~1–2k
- SSD: ~8–26k
- RetinaNet: ~100k. RetinaNet can afford ~100k boxes because focal loss addresses the class imbalance problem that so many (mostly background) boxes create.
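To see where these orders of magnitude come from, here is a rough sketch of the arithmetic. It assumes an 800x800 input, pyramid levels with strides 8 to 128, 9 anchors per location, and feature map sizes obtained by ceiling division; the helper `count_anchors` is a hypothetical name used only for illustration, and exact counts depend on the real input size and padding.

```python
import math

def count_anchors(height, width, strides=(8, 16, 32, 64, 128), anchors_per_cell=9):
    """Count anchor boxes over all pyramid levels.

    Assumption: each level's feature map is ceil(size / stride) on each side.
    """
    total = 0
    for stride in strides:
        rows = math.ceil(height / stride)
        cols = math.ceil(width / stride)
        total += rows * cols * anchors_per_cell
    return total

# YOLOv1: 7x7 grid, 2 boxes per cell.
yolo_v1_boxes = 7 * 7 * 2

# RetinaNet-style pyramid on an assumed 800x800 input.
retinanet_boxes = count_anchors(800, 800)

print(yolo_v1_boxes, retinanet_boxes)  # 98 120087
```

Almost all of these boxes are background, which is why the imbalance matters for the loss.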
Cross Entropy: $-(y\log(p) + (1-y)\log(1-p))$ for m=2 (binary classification)
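α-Balanced CE Loss: $-\alpha_t\log(p_t)$, where $p_t = p$ if $y=1$ and $1-p$ otherwise, and $\alpha_t$ is built the same way from a weight $\alpha \in [0,1]$
Focal Loss (FL): $-(1-p_t)^\gamma\log(p_t)$, with focusing parameter $\gamma \ge 0$
α-Balanced Variant of FL: $-\alpha_t(1-p_t)^\gamma\log(p_t)$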
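The four losses named above can be sketched for a single binary prediction as follows. This is a minimal illustration, not the reference implementation: the function names are made up for these notes, the defaults γ = 2 and α = 0.25 are the values the focal loss paper reports as working best, and `p` is assumed to lie strictly between 0 and 1.

```python
import math

def _pt(p, y):
    # p_t is the probability the model assigns to the true class.
    return p if y == 1 else 1.0 - p

def _alpha_t(alpha, y):
    # alpha weights the positive class, (1 - alpha) the negative class.
    return alpha if y == 1 else 1.0 - alpha

def cross_entropy(p, y):
    return -math.log(_pt(p, y))

def balanced_cross_entropy(p, y, alpha=0.25):
    return -_alpha_t(alpha, y) * math.log(_pt(p, y))

def focal_loss(p, y, gamma=2.0):
    pt = _pt(p, y)
    # (1 - p_t)^gamma shrinks the loss of well-classified examples.
    return -((1.0 - pt) ** gamma) * math.log(pt)

def balanced_focal_loss(p, y, alpha=0.25, gamma=2.0):
    return _alpha_t(alpha, y) * focal_loss(p, y, gamma)

# An easy negative (p = 0.1 for a background box) versus a hard one (p = 0.9).
easy = focal_loss(0.1, 0) / cross_entropy(0.1, 0)
hard = focal_loss(0.9, 0) / cross_entropy(0.9, 0)
print(easy, hard)
```

With γ = 2 the easy example keeps about 1% of its cross-entropy loss while the hard one keeps about 81%, so the many easy background boxes stop dominating the total. Setting γ = 0 recovers plain cross entropy.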
Model Initialization(???)
- A prior π is set for the value of p at the start of training, so that the model’s estimated p for examples of the rare class is low, e.g. 0.01, in order to improve the training stability in the case of heavy class imbalance.
- Training RetinaNet with standard CE loss and WITHOUT the prior π for initialization fails: the network diverges early in training.
- Training is insensitive to the exact value of π, so π = 0.01 is used for all experiments.
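One way to realise this prior, as described in the focal loss paper, is to initialise the bias of the final classification layer to b = -log((1 - π) / π), so that the sigmoid output starts at π. The sketch below only checks that arithmetic; `prior_bias` and `sigmoid` are illustrative helper names, and the rest of the layer is assumed to output roughly zero at the start of training.

```python
import math

def prior_bias(pi=0.01):
    """Bias b such that sigmoid(b) == pi."""
    return -math.log((1.0 - pi) / pi)

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

b = prior_bias(0.01)
p_init = sigmoid(b)

# CE loss of one background box (y = 0) at the start of training:
loss_zero_bias = -math.log(1.0 - sigmoid(0.0))  # p = 0.5 with a zero bias
loss_with_prior = -math.log(1.0 - p_init)       # p = 0.01 with the prior

print(b, p_init, loss_zero_bias, loss_with_prior)
```

With a zero bias every background box starts with a loss of about 0.69, and summed over ~100k boxes this swamps the few foreground boxes; with the prior each background box starts at about 0.01.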
RetinaNet Detector Arch
# Deformable Convolutional Networks.
# Region of Interest Pooling