STN
Paper: Spatial Transformer Networks (Jaderberg et al., 2015), https://arxiv.org/abs/1506.02025
An STN crops out and scale-normalizes the relevant region of the input, which can simplify the subsequent classification task and lead to better classification performance.


Quick Review on Spatial Transformation Matrices
The paper mainly considers three transformations for the STN to learn. More sophisticated transformations can be applied as well.
1.1 Affine Transformation
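For reference, a sketch of the affine case in the paper's notation, where $(x_i^t, y_i^t)$ are target (output) coordinates and $(x_i^s, y_i^s)$ are the source (input) coordinates to sample from. The six entries $\theta_{11}, \dots, \theta_{23}$ are the parameters; the index naming is only a labelling choice.

$$
\begin{pmatrix} x_i^s \\ y_i^s \end{pmatrix}
= T_\theta(G_i)
= \begin{pmatrix} \theta_{11} & \theta_{12} & \theta_{13} \\ \theta_{21} & \theta_{22} & \theta_{23} \end{pmatrix}
\begin{pmatrix} x_i^t \\ y_i^t \\ 1 \end{pmatrix}
$$

This covers cropping, translation, rotation, scaling and skew. Setting $\theta_{12} = \theta_{21} = 0$ and $\theta_{11} = \theta_{22} = s$ gives the isotropic scale plus translation (attention) case.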
1.2 Projective Transformation
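A sketch of the projective case, written in homogeneous coordinates. The symbols $h_{ij}$, $\tilde{x}$, $\tilde{y}$, $\tilde{w}$ are notation assumed here for illustration, not taken from the paper. The 3x3 matrix is defined up to scale, so it has 8 free parameters.

$$
\begin{pmatrix} \tilde{x} \\ \tilde{y} \\ \tilde{w} \end{pmatrix}
= \begin{pmatrix} h_{11} & h_{12} & h_{13} \\ h_{21} & h_{22} & h_{23} \\ h_{31} & h_{32} & 1 \end{pmatrix}
\begin{pmatrix} x_i^t \\ y_i^t \\ 1 \end{pmatrix},
\qquad
x_i^s = \frac{\tilde{x}}{\tilde{w}}, \quad y_i^s = \frac{\tilde{y}}{\tilde{w}}
$$

With $h_{31} = h_{32} = 0$ the division by $\tilde{w}$ disappears and this reduces to the affine case.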
1.3 Thin Plate Spline (TPS) Transformation
To be explored
Spatial Transformer Network (STN)
STN = Localisation Net + Grid Generator + Sampler
2.1 Localisation Net
input feature map: (W,H,C)
output: $\theta$, the parameters of the transformation $T_\theta$
2.2 Grid Generator
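A minimal NumPy sketch of a grid generator for the affine case, assuming a 2x3 matrix $\theta$ and coordinates normalised to [-1, 1] as in the paper. The function name `affine_grid` and the (x, y) ordering of the last axis are choices made for this sketch.

```python
import numpy as np

def affine_grid(theta, out_h, out_w):
    """Map every output pixel to a source coordinate with a 2x3 affine matrix.

    Target coordinates are normalised to [-1, 1] on both axes.
    Returns an array of shape (out_h, out_w, 2) holding (x_s, y_s).
    """
    theta = np.asarray(theta, dtype=np.float64).reshape(2, 3)
    xs = np.linspace(-1.0, 1.0, out_w)
    ys = np.linspace(-1.0, 1.0, out_h)
    x_t, y_t = np.meshgrid(xs, ys)                 # each (out_h, out_w)
    ones = np.ones_like(x_t)
    target = np.stack([x_t, y_t, ones], axis=-1)   # homogeneous (x_t, y_t, 1)
    return target @ theta.T                        # (out_h, out_w, 2)

# Identity parameters reproduce the regular target grid.
identity = affine_grid([[1, 0, 0], [0, 1, 0]], out_h=3, out_w=4)
# Scale 0.5 with a small x-shift: the grid covers a zoomed-in crop of the input.
crop = affine_grid([[0.5, 0, 0.25], [0, 0.5, 0]], out_h=3, out_w=3)
```

The grid only says where to read from. Producing output values at those (generally fractional) locations is the sampler's job.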
2.3 Sampler
Sampling Kernel
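With the bilinear kernel, each output value is a weighted sum of the four input pixels around the source location, with weights $\max(0, 1 - |x^s - m|) \cdot \max(0, 1 - |y^s - n|)$. Below is a hedged NumPy sketch. The function name `bilinear_sample`, the (H, W, C) layout, the [-1, 1] coordinate convention and the zero contribution from out-of-range pixels are all assumptions of this sketch.

```python
import numpy as np

def bilinear_sample(U, grid):
    """Sample feature map U (H, W, C) at normalised coords grid (H', W', 2).

    grid[..., 0] is x and grid[..., 1] is y, both in [-1, 1].
    Pixels outside the input contribute zero (an assumption of this sketch).
    """
    H, W, C = U.shape
    # Convert from [-1, 1] to pixel coordinates.
    x = (grid[..., 0] + 1.0) * (W - 1) / 2.0
    y = (grid[..., 1] + 1.0) * (H - 1) / 2.0
    x0 = np.floor(x).astype(int)
    y0 = np.floor(y).astype(int)
    out = np.zeros(grid.shape[:2] + (C,), dtype=np.float64)
    for dy in (0, 1):
        for dx in (0, 1):
            xi, yi = x0 + dx, y0 + dy
            # Bilinear kernel weight for neighbour (yi, xi).
            w = np.maximum(0, 1 - np.abs(x - xi)) * np.maximum(0, 1 - np.abs(y - yi))
            valid = (xi >= 0) & (xi < W) & (yi >= 0) & (yi < H)
            xi_c = np.clip(xi, 0, W - 1)
            yi_c = np.clip(yi, 0, H - 1)
            out += (w * valid)[..., None] * U[yi_c, xi_c]
    return out

# Sampling the centre of a 2x2 map averages its four pixels.
V = np.array([[0.0, 1.0], [2.0, 3.0]]).reshape(2, 2, 1)
mid = bilinear_sample(V, np.zeros((1, 1, 2)))
```

Because the weights are piecewise linear in the source coordinates, gradients can flow back through the sampler to $\theta$, which is what makes the module trainable end to end.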
DCN
- Regular convolution samples the input on a regular grid R, e.g. R = {(-1,-1), (-1,0), ..., (1,1)} for a 3x3 kernel.
- Deformable convolution samples on the same grid R, but each point $p_n$ is augmented by a learnable offset $\Delta p_n$.
- An extra convolution over the same input generates 2N offset feature maps, corresponding to the N 2D offsets $\Delta p_n$ (an x- and a y-component for each offset), where N = |R|.
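The points above can be sketched in NumPy for a single-channel 3x3 kernel, computing $y(p_0) = \sum_{p_n \in R} w(p_n) \cdot x(p_0 + p_n + \Delta p_n)$ with bilinear interpolation at the fractional positions. This is an illustrative sketch, not the reference implementation. The names `bilinear` and `deform_conv2d`, the (dy, dx) interleaved channel layout of `offsets`, and the zero padding are assumptions. In a real DCN the offsets come from a convolution layer; here they are passed in directly.

```python
import numpy as np

def bilinear(x, py, px):
    """Bilinearly interpolate 2D array x at fractional (py, px); zero outside."""
    H, W = x.shape
    y0, x0 = int(np.floor(py)), int(np.floor(px))
    val = 0.0
    for yi in (y0, y0 + 1):
        for xi in (x0, x0 + 1):
            if 0 <= yi < H and 0 <= xi < W:
                val += x[yi, xi] * max(0.0, 1 - abs(py - yi)) * max(0.0, 1 - abs(px - xi))
    return val

def deform_conv2d(x, weight, offsets):
    """Single-channel 3x3 deformable convolution (stride 1, zero padding 1).

    x:       (H, W) input
    weight:  (3, 3) kernel, one weight per grid point p_n in R
    offsets: (H, W, 2N) with N = 9, laid out as (dy_0, dx_0, dy_1, dx_1, ...);
             this channel layout is an assumption of the sketch.
    """
    H, W = x.shape
    # Regular grid R = {(-1,-1), (-1,0), ..., (1,1)}
    R = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
    y = np.zeros((H, W))
    for i in range(H):
        for j in range(W):
            for n, (dy, dx) in enumerate(R):
                oy, ox = offsets[i, j, 2 * n], offsets[i, j, 2 * n + 1]
                # Sample at p0 + p_n + delta p_n and weight by w(p_n).
                y[i, j] += weight[dy + 1, dx + 1] * bilinear(x, i + dy + oy, j + dx + ox)
    return y

x = np.arange(16, dtype=float).reshape(4, 4)
# Zero offsets: identical to a regular 3x3 convolution.
regular = deform_conv2d(x, np.ones((3, 3)), np.zeros((4, 4, 18)))
# Every sampling point shifted by half a pixel in x; kernel keeps only the centre tap.
centre = np.zeros((3, 3))
centre[1, 1] = 1.0
shift = np.zeros((4, 4, 18))
shift[..., 1::2] = 0.5
shifted = deform_conv2d(x, centre, shift)
```

With all offsets at zero the result matches regular convolution, which is why the offset-generating layer is typically initialised to output zeros.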