Visual recognition for on-device and embedded applications poses many challenges:
- the model must run quickly with high accuracy
- in a resource-constrained environment

This paper describes an efficient network architecture and a set of two hyper-parameters that build very small, low-latency models.
Previous work:
- Depthwise separable convolutions
- Flattened networks
- Factorized networks
- Xception network
- SqueezeNet
- A different approach: shrinking, factorizing, or compressing a pretrained network
- Another method for training small networks:
  - Distillation: a larger network teaches a small one. This is complementary to MobileNet and is covered in Section 4 (a minimal loss sketch follows this list).
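The distillation item above refers to Hinton-style knowledge distillation; here is a minimal sketch of such a loss in PyTorch (the temperature `T` and mixing weight `alpha` are illustrative hyper-parameters, not values from the paper):

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.9):
    """Hinton-style distillation: match softened teacher outputs + true labels.
    T (temperature) and alpha (mixing weight) are illustrative hyper-parameters."""
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)  # rescale so the soft-target gradients keep their magnitude
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard

# Toy usage: 8 samples, 10 classes.
s, t = torch.randn(8, 10), torch.randn(8, 10)
y = torch.randint(0, 10, (8,))
print(distillation_loss(s, t, y))
```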
## MobileNet architecture
Separable convolutions: https://towardsdatascience.com/a-basic-introduction-to-separable-convolutions-b99ec3102728

Why does a spatial separable convolution reduce multiplications? Factoring a $D_K \times D_K$ kernel into a $D_K \times 1$ kernel followed by a $1 \times D_K$ kernel cuts the multiplications per output position from $D_K^2$ to $2 D_K$ (e.g. a 3x3 kernel drops from 9 to 6).
Traditional convolution:
- input: $D_F \times D_F \times M$, output: $D_G \times D_G \times N$, kernel: $D_K \times D_K \times M \times N$
- computational cost: $D_K \cdot D_K \cdot M \cdot N \cdot D_F \cdot D_F$

Depthwise separable convolution:
- computational cost: $D_K \cdot D_K \cdot M \cdot D_F \cdot D_F + M \cdot N \cdot D_F \cdot D_F$ (depthwise + 1x1 pointwise)
- reduction factor vs. traditional: $\frac{1}{N} + \frac{1}{D_K^2}$
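A quick way to sanity-check these counts (a minimal sketch; the layer sizes at the bottom are made-up illustrative values, not from the paper):

```python
# Multiplication counts for a standard vs. a depthwise separable convolution,
# using the notation above: D_F = input spatial size, D_K = kernel size,
# M = input channels, N = output channels (stride 1, 'same' padding, so D_G = D_F).

def standard_conv_mults(D_F: int, D_K: int, M: int, N: int) -> int:
    # Each of the D_F * D_F output positions applies N kernels of size D_K * D_K * M.
    return D_K * D_K * M * N * D_F * D_F

def depthwise_separable_mults(D_F: int, D_K: int, M: int, N: int) -> int:
    depthwise = D_K * D_K * M * D_F * D_F   # one D_K x D_K filter per input channel
    pointwise = M * N * D_F * D_F           # 1x1 convolution to mix channels
    return depthwise + pointwise

# Example values (hypothetical layer, chosen only for illustration):
D_F, D_K, M, N = 14, 3, 512, 512
std = standard_conv_mults(D_F, D_K, M, N)
sep = depthwise_separable_mults(D_F, D_K, M, N)
print(std, sep, sep / std)  # ratio ~= 1/N + 1/D_K**2 ~= 0.113
```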
Width multiplier:
- $\alpha \in (0, 1]$ thins every layer uniformly: input channels $M \to \alpha M$, output channels $N \to \alpha N$; cost shrinks roughly by $\alpha^2$.
Resolution multiplier:
- $\rho \in (0, 1]$ scales the input resolution (and hence every internal feature map): $D_F \to \rho D_F$; cost shrinks by $\rho^2$. A sketch of both multipliers follows.
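A minimal sketch of how both multipliers scale the cost of one depthwise separable layer (`scaled_cost` is a hypothetical helper for illustration, not an API from any library; same made-up layer sizes as above):

```python
# Cost of a depthwise separable layer under width multiplier alpha and
# resolution multiplier rho: D_K*D_K*(aM)*(rD_F)^2 + (aM)*(aN)*(rD_F)^2.
def scaled_cost(D_F, D_K, M, N, alpha=1.0, rho=1.0):
    M_a, N_a = int(alpha * M), int(alpha * N)
    D_Fr = int(rho * D_F)
    return D_K * D_K * M_a * D_Fr**2 + M_a * N_a * D_Fr**2

base = scaled_cost(14, 3, 512, 512)
print(scaled_cost(14, 3, 512, 512, alpha=0.5) / base)  # ~alpha^2 (pointwise term dominates)
print(scaled_cost(14, 3, 512, 512, rho=0.5) / base)    # rho^2 = 0.25 exactly
```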
## MobileNetV2: Inverted Residuals and Linear Bottlenecks
Inverted residuals:
- Original residual blocks are wide-narrow-wide: the shortcut connects wide layers, with a bottleneck in the middle.
- Inverted residuals are narrow-wide-narrow: the shortcut connects the narrow bottlenecks, and the block expands to a wide layer in the middle (see the sketch after the linear bottleneck note below).
### Linear bottleneck
The final 1x1 projection convolution in each block has no activation after it: applying ReLU in the low-dimensional bottleneck destroys information, so the projection is kept linear.
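A minimal PyTorch sketch of an inverted residual block with a linear bottleneck (the expansion factor of 6 matches the MobileNetV2 paper; the channel counts in the usage line are illustrative):

```python
import torch
import torch.nn as nn

class InvertedResidual(nn.Module):
    """narrow -> wide -> narrow, with a linear (no activation) projection."""
    def __init__(self, in_ch: int, out_ch: int, expansion: int = 6, stride: int = 1):
        super().__init__()
        hidden = in_ch * expansion
        self.use_residual = stride == 1 and in_ch == out_ch
        self.block = nn.Sequential(
            # 1x1 expansion: narrow -> wide
            nn.Conv2d(in_ch, hidden, 1, bias=False),
            nn.BatchNorm2d(hidden),
            nn.ReLU6(inplace=True),
            # 3x3 depthwise convolution (groups == channels)
            nn.Conv2d(hidden, hidden, 3, stride, 1, groups=hidden, bias=False),
            nn.BatchNorm2d(hidden),
            nn.ReLU6(inplace=True),
            # 1x1 linear projection: wide -> narrow, no ReLU afterwards
            nn.Conv2d(hidden, out_ch, 1, bias=False),
            nn.BatchNorm2d(out_ch),
        )

    def forward(self, x):
        out = self.block(x)
        return x + out if self.use_residual else out

x = torch.randn(1, 24, 56, 56)
print(InvertedResidual(24, 24)(x).shape)  # torch.Size([1, 24, 56, 56])
```

Note that ReLU6 follows the expansion and depthwise steps but not the final projection; that omission is exactly the linear bottleneck.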
### ReLU6 instead of ReLU
$\mathrm{ReLU6}(x) = \min(\max(0, x), 6)$: capping activations at 6 keeps them in a fixed range, which improves robustness under low-precision (fixed-point) computation.
## MobileNet V3
### Squeeze-and-Excitation networks https://towardsdatascience.com/squeeze-and-excitation-networks-9ef5e71eacd7
SE blocks recalibrate channels: global-average-pool to a per-channel descriptor, pass it through a small bottleneck MLP, and scale each channel by the resulting sigmoid weight.
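A minimal PyTorch sketch of a squeeze-and-excitation block (the reduction ratio of 16 is the SENet default; MobileNetV3 places such blocks inside its inverted residual bottlenecks):

```python
import torch
import torch.nn as nn

class SqueezeExcite(nn.Module):
    """Squeeze-and-Excitation: reweight channels by globally pooled statistics."""
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):
        b, c, _, _ = x.shape
        w = x.mean(dim=(2, 3))           # squeeze: global average pool -> (b, c)
        w = self.fc(w).view(b, c, 1, 1)  # excite: per-channel gate in [0, 1]
        return x * w                     # scale each channel

x = torch.randn(2, 64, 28, 28)
print(SqueezeExcite(64)(x).shape)  # torch.Size([2, 64, 28, 28])
```

### Platform-Aware NAS for Block-wise Search
MobileNetV3 uses hardware-aware NAS to search the network at the block level.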
### Hard-swish activation layers
$\text{h-swish}(x) = x \cdot \frac{\mathrm{ReLU6}(x + 3)}{6}$, a piecewise-linear approximation of swish that is cheaper to compute on mobile hardware.
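A minimal sketch of hard-swish (recent PyTorch versions also ship this built in as `nn.Hardswish`):

```python
import torch
import torch.nn as nn

class HardSwish(nn.Module):
    """h-swish(x) = x * ReLU6(x + 3) / 6 -- a cheap approximation of swish."""
    def __init__(self):
        super().__init__()
        self.relu6 = nn.ReLU6()

    def forward(self, x):
        return x * self.relu6(x + 3.0) / 6.0

x = torch.tensor([-4.0, -1.0, 0.0, 1.0, 4.0])
print(HardSwish()(x))  # tensor([-0.0000, -0.3333, 0.0000, 0.6667, 4.0000])
```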