Abstract
- Detection performance tends to degrade as the IoU threshold used for training increases, for two reasons:
- Overfitting during training, due to exponentially vanishing positive samples
- Inference-time mismatch between the IoUs for which the detector is optimal and those of the input hypotheses
Introduction


- A detector optimized at a single IoU level is not necessarily optimal at other levels
- Higher quality detection requires a closer quality match between detector and the hypotheses it processes
- A detector can only have high quality if presented with high quality proposals.
- So does simply raising the IoU threshold improve performance? → No
- The distribution of hypotheses out of a proposal detector is usually heavily imbalanced towards low quality → high IoU thresholds lead to a smaller number of positive training samples → prone to overfitting
- A detector can be suboptimal when it is asked to work on hypotheses of other quality levels
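
The shrinking pool of positives can be illustrated with a minimal sketch. The proposals below are hypothetical (a ground-truth box shifted horizontally by growing offsets), and the thresholds 0.5, 0.6, 0.7 are chosen only for illustration; the helper names `iou` and `count_positives` are not from the notes.

```python
def iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def count_positives(proposals, gt, u):
    """Number of proposals labeled positive at IoU threshold u."""
    return sum(1 for p in proposals if iou(p, gt) >= u)


# Hypothetical data: one ground-truth box and proposals shifted by dx pixels.
gt = (0.0, 0.0, 100.0, 100.0)
proposals = [(dx, 0.0, 100.0 + dx, 100.0) for dx in (5, 10, 20, 30, 40, 50, 60)]

counts = {u: count_positives(proposals, gt, u) for u in (0.5, 0.6, 0.7)}
print(counts)  # {0.5: 4, 0.6: 3, 0.7: 2}
```

With the same proposal set, each increase of the threshold removes positives, which is the situation the overfitting bullet above describes.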

- Output IoU of a regressor is almost invariably better than the input IoU
- The output of a detector trained with a certain IoU threshold is a good distribution to train the detector of the next higher IoU threshold
Propose Cascade R-CNN, a multi-stage extension of R-CNN, where detector stages deeper into the cascade are sequentially more selective against close false positives.
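
The resampling idea can be sketched with a toy cascade. Everything here is an assumption for illustration: `toy_regress` is a stand-in that moves a box halfway toward the ground truth (a real regressor predicts offsets from image features and never sees the ground truth), and the stage thresholds 0.5, 0.6, 0.7 are picked as an example of increasing thresholds.

```python
def iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def toy_regress(box, gt, step=0.5):
    """Hypothetical regressor: moves each coordinate a fraction `step` toward gt."""
    return tuple(c + step * (g - c) for c, g in zip(box, gt))


def cascade(proposals, gt, thresholds=(0.5, 0.6, 0.7)):
    """Run the stages in order; each stage labels positives at its own
    threshold and passes its regressed boxes on to the next stage."""
    boxes = list(proposals)
    history = []
    for u in thresholds:
        n_pos = sum(1 for b in boxes if iou(b, gt) >= u)
        history.append((u, n_pos))
        boxes = [toy_regress(b, gt) for b in boxes]
    return boxes, history


# Hypothetical data: ground truth and horizontally shifted proposals.
gt = (0.0, 0.0, 100.0, 100.0)
proposals = [(dx, 0.0, 100.0 + dx, 100.0) for dx in (10, 20, 30, 40, 60, 80)]

final_boxes, history = cascade(proposals, gt)
print(history)  # [(0.5, 3), (0.6, 4), (0.7, 5)]
```

In this toy setting the number of positives does not shrink as the threshold rises, because each stage sees the improved boxes produced by the previous one rather than the original proposals.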
Related Works
Iterative BBox at inference



- Single regression step is insufficient for accurate localization → apply BBox regression repeatedly
- The regressor is trained at a fixed IoU threshold $u = 0.5$ → performance degrades on high-IoU BBoxes

- The distribution of the bounding boxes changes significantly after each iteration.
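
As a compact way to write the procedure above, iterative BBox regression applies one and the same regressor $f$ repeatedly to an image patch $x$ and a box $b$. The symbols $f$, $x$, $b$ and $f'$ are notation assumed here for illustration:

$$
f'(x, b) = f \circ f \circ \cdots \circ f(x, b)
$$

Because every step reuses the same $f$, trained at $u = 0.5$, later steps receive boxes from a different distribution than the one $f$ was trained on.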
Integral Loss
- It is difficult to ask a single classifier to perform uniformly well over all IoU levels.
- Ensemble of classifiers, one per IoU threshold?



- Prone to overfitting for high quality classifiers
- High quality classifiers are required to process low quality proposals at inference
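
A minimal sketch of the integral-loss idea, under stated assumptions: one classifier head per IoU threshold, each trained against its own label, with the per-threshold losses summed. The threshold set, the binary (object vs. background) simplification, and the names `cross_entropy` and `integral_loss` are hypothetical choices for illustration.

```python
import math


def cross_entropy(p_pos, label):
    """Binary cross-entropy for one sample; p_pos is the predicted P(object)."""
    eps = 1e-12
    p = min(max(p_pos, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -math.log(p) if label == 1 else -math.log(1.0 - p)


def integral_loss(scores_by_u, proposal_iou):
    """Sum of per-threshold classification losses for a single proposal.

    scores_by_u maps each IoU threshold u to the score of the classifier
    head assigned to that threshold (hypothetical layout).
    """
    total = 0.0
    for u, p_pos in scores_by_u.items():
        label = 1 if proposal_iou >= u else 0  # label depends on the threshold
        total += cross_entropy(p_pos, label)
    return total


# Hypothetical proposal with IoU 0.65: positive for u = 0.5 and 0.6,
# negative for u = 0.7.
scores = {0.5: 0.9, 0.6: 0.8, 0.7: 0.3}
loss = integral_loss(scores, proposal_iou=0.65)
print(round(loss, 4))  # 0.6852
```

The same proposal is a positive for the low-threshold heads and a negative for the high-threshold head, so the high quality heads see few positives during training yet must still score every low quality proposal at inference.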