Advanced Concepts About Convolutional Neural Networks

| Term | Description |
| --- | --- |
| Padding | Adding additional border(s) to the image before convolution |
| Strided Convolution | Convolving with a step of `s` between filter positions |
| Convolutions Over Volume | Applying convs on n-dimensional input (such as an RGB image) |

Padding adds one or more borders to the image before convolution. With one border (`p` = 1) the image becomes `n+2 x n+2`, and after convolving with a `3 x 3` filter we end up with an `n x n` image, which is the original size of the image.

`p` = number of added borders

By convention, the border is filled with 0.
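
As a minimal sketch of zero padding, the snippet below adds `p` borders of zeros around a square image stored as a plain list of rows. The helper name `zero_pad` and the sample image are illustrative assumptions, not part of the original notes.

```python
def zero_pad(image, p):
    """Return a copy of a 2D image (list of rows) surrounded by p borders of zeros."""
    width = len(image[0]) + 2 * p
    padded = [[0] * width for _ in range(p)]            # top borders
    for row in image:
        padded.append([0] * p + list(row) + [0] * p)    # left and right borders
    padded.extend([0] * width for _ in range(p))        # bottom borders
    return padded


# Hypothetical 3 x 3 image; with p = 1 it becomes 5 x 5, i.e. (n + 2p) x (n + 2p).
image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
padded = zero_pad(image, p=1)
print(len(padded), len(padded[0]))  # 5 5
```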

To make this concrete, consider two kinds of convolution:

**Valid convolution** means no padding, so:

`n x n` * `f x f` → `n-f+1 x n-f+1`

**Same convolution** pads so that the output size is the **same** as the input size.

So we want:

`n+2p-f+1` = `n`

Hence:

`p` = `(f-1)/2`

By convention, `f` is chosen to be odd, so that `p` is an integer.
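
The two size rules above can be checked with a few lines of arithmetic. This is only a sketch; the function names are made up for illustration.

```python
def valid_output_size(n, f):
    """Output side length with no padding: n - f + 1."""
    return n - f + 1


def same_padding(f):
    """Padding that keeps the output size equal to the input size: (f - 1) / 2."""
    if f % 2 == 0:
        raise ValueError("f must be odd for symmetric padding")
    return (f - 1) // 2


def padded_output_size(n, f, p):
    """Output side length with p borders of padding: n + 2p - f + 1."""
    return n + 2 * p - f + 1


n, f = 6, 3
print(valid_output_size(n, f))            # 4
p = same_padding(f)
print(p, padded_output_size(n, f, p))     # 1 6
```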

Strided convolution is another approach to convolution: we compute the output by moving the filter across the image in steps of `s` (the stride) instead of one pixel at a time.

For an `n x n` image and an `f x f` filter, with padding `p` and stride `s`, the output image size can be calculated by the following formula:

$\left \lfloor{\frac{n+2p-f}{s}+1}\right \rfloor \times \left \lfloor{\frac{n+2p-f}{s}+1}\right \rfloor$
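
The formula can be sketched in code together with a naive strided convolution. The names `conv_output_size` and `conv2d` are illustrative assumptions, and the loop computes cross-correlation (no filter flipping), which is what deep learning usually calls convolution.

```python
def conv_output_size(n, f, p=0, s=1):
    """Side length of the output: floor((n + 2p - f) / s + 1)."""
    return (n + 2 * p - f) // s + 1


def conv2d(image, kernel, s=1):
    """Naive strided convolution of a square image with a square kernel, no padding."""
    n, f = len(image), len(kernel)
    out = conv_output_size(n, f, p=0, s=s)
    result = []
    for i in range(out):
        row = []
        for j in range(out):
            total = 0
            for a in range(f):
                for b in range(f):
                    # The window's top-left corner moves s pixels per step.
                    total += image[i * s + a][j * s + b] * kernel[a][b]
            row.append(total)
        result.append(row)
    return result


# Hypothetical example: a 7 x 7 image of ones, a 3 x 3 filter of ones, stride 2.
image = [[1] * 7 for _ in range(7)]
kernel = [[1] * 3 for _ in range(3)]
print(conv_output_size(7, 3, p=0, s=2))   # 3
print(conv2d(image, kernel, s=2))         # [[9, 9, 9], [9, 9, 9], [9, 9, 9]]
```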

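To apply the convolution operation to an RGB image, for example a 10x10 px RGB image, note that the image's dimensions are technically 10x10x3, so we can apply, for example, a 3x3x3 filter (*or fxfx3* in general). The number of channels in the filter must match the number of channels in the image.

A filter can also be made to respond to one specific color channel, for example by setting its weights for the other channels to zero.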
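
A rough sketch of convolution over volume follows, using the 10x10x3 image and 3x3x3 filter sizes from the text. The helper `conv_volume`, the all-ones image, and the red-only filter are illustrative assumptions; the image is indexed as `image[row][col][channel]`.

```python
def conv_volume(image, kernel):
    """Convolve an n x n x c image with an f x f x c filter (no padding, stride 1).

    Every channel contributes to the same sum, so the output is 2D.
    """
    n, f, c = len(image), len(kernel), len(kernel[0][0])
    if len(image[0][0]) != c:
        raise ValueError("filter and image must have the same number of channels")
    out = n - f + 1
    return [
        [
            sum(
                image[i + a][j + b][ch] * kernel[a][b][ch]
                for a in range(f) for b in range(f) for ch in range(c)
            )
            for j in range(out)
        ]
        for i in range(out)
    ]


# 10 x 10 x 3 image of ones and a 3 x 3 x 3 filter that only looks at channel 0 (red).
image = [[[1, 1, 1] for _ in range(10)] for _ in range(10)]
kernel = [[[1, 0, 0] for _ in range(3)] for _ in range(3)]
output = conv_volume(image, kernel)
print(len(output), len(output[0]), output[0][0])  # 8 8 9
```

Note that the output is 8x8 rather than 8x8x3: one filter produces one 2D feature map regardless of the number of input channels.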

| Layer | Description |
| --- | --- |
| Convolution | Filters to extract features |
| Pooling | A technique to reduce size of representation and to speed up the computations |
| Fully Connected | Standard single neural network layer (one dimensional) |
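
To illustrate how pooling reduces the size of the representation, here is a small sketch of max pooling, which is one common kind of pooling (the notes do not say which kind is meant). The function name and the sample values are assumptions for illustration.

```python
def max_pool(image, f=2, s=2):
    """Max pooling over a square 2D image with an f x f window and stride s."""
    n = len(image)
    out = (n - f) // s + 1
    return [
        [
            max(image[i * s + a][j * s + b] for a in range(f) for b in range(f))
            for j in range(out)
        ]
        for i in range(out)
    ]


# Hypothetical 4 x 4 input; 2 x 2 pooling with stride 2 halves each side.
image = [[1, 3, 2, 1],
         [2, 9, 1, 1],
         [1, 3, 2, 3],
         [5, 6, 1, 2]]
print(max_pool(image))  # [[9, 2], [6, 3]]
```

Pooling has no weights to learn: the window size and stride are fixed hyperparameters.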

Usually, when people report the number of layers in an NN, they count only the layers that have weights and parameters.

Convention: `CONV1` + `POOL1` = `LAYER1`

Convolutional layers give better performance since they decrease the number of parameters that have to be tuned.
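
As a rough illustration of the parameter savings, assuming the comparison is between a convolutional layer and a fully connected layer that produce the same output size, the counts can be worked out directly. The layer sizes below (a 32x32x3 input and six 5x5 filters) are made up for the example.

```python
def conv_params(f, in_channels, n_filters):
    """Weights plus one bias per filter: (f * f * in_channels + 1) * n_filters."""
    return (f * f * in_channels + 1) * n_filters


def dense_params(n_inputs, n_outputs):
    """Weights plus one bias per output unit: (n_inputs + 1) * n_outputs."""
    return (n_inputs + 1) * n_outputs


# Hypothetical layer: 32 x 32 x 3 input, six 5 x 5 filters, giving a 28 x 28 x 6 output.
print(conv_params(5, 3, 6))                      # 456
print(dense_params(32 * 32 * 3, 28 * 28 * 6))    # 14455392
```

The convolutional layer reuses the same small filter at every position, which is why its parameter count does not depend on the image size.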