Region-Based CNNs
Region Based Convolutional Neural Network
The R-CNN approach depends on:
Selecting a huge number of candidate regions
Reducing them to about 2,000 using selective search
Each of these regions is called a region proposal
Extracting convolutional features from each region proposal
Finally, checking whether any object exists in each region (a minimal sketch of this pipeline follows this list)
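A minimal sketch of that pipeline, assuming a recent torchvision and a hypothetical `propose_regions` helper that stands in for selective search (the original R-CNN used AlexNet features and per-class SVMs; a ResNet backbone is used here purely for illustration):

```python
# Hedged sketch of the R-CNN pipeline, not the original implementation.
# `propose_regions` is a hypothetical helper standing in for selective search.
import torch
import torchvision.models as models
import torchvision.transforms.functional as TF

backbone = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
backbone.fc = torch.nn.Identity()   # drop the classifier, keep the 2048-d features
backbone.eval()

def rcnn_features(image, proposals):
    """image: PIL image; proposals: list of (x, y, w, h) boxes from selective search."""
    feats = []
    with torch.no_grad():
        for (x, y, w, h) in proposals:
            # warp every proposal to a fixed input size, as R-CNN does
            crop = TF.resized_crop(image, top=y, left=x, height=h, width=w, size=[224, 224])
            tensor = TF.to_tensor(crop).unsqueeze(0)      # 1 x 3 x 224 x 224
            feats.append(backbone(tensor))                # one CNN forward pass per region
    return torch.cat(feats)                               # n_proposals x 2048

# proposals = propose_regions(image)   # hypothetical; see the selective search sketch below
# features = rcnn_features(image, proposals)
# A per-class classifier (SVMs in the original paper) is then applied to these features.
```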
Selective search is an algorithm for identifying candidate object regions. Four cues essentially characterize an object: varying scales, colors, textures, and enclosure. Selective search looks for these patterns in the image and, based on them, proposes various regions.
In other words, it is an algorithm that computes a hierarchical grouping of similar regions and proposes candidate regions from that grouping.
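The hypothetical `propose_regions` above could be backed by OpenCV's selective search implementation (shipped in the contrib package `opencv-contrib-python`); a minimal sketch, with an illustrative image path:

```python
import cv2

image = cv2.imread("example.jpg")   # illustrative path

ss = cv2.ximgproc.segmentation.createSelectiveSearchSegmentation()
ss.setBaseImage(image)
ss.switchToSelectiveSearchFast()    # a slower "quality" mode also exists

rects = ss.process()                    # array of (x, y, w, h) region proposals
print(len(rects), "region proposals")   # usually thousands; R-CNN keeps about 2,000

for (x, y, w, h) in rects[:100]:        # draw the first 100 proposals
    cv2.rectangle(image, (int(x), int(y)), (int(x + w), int(y + h)), (0, 255, 0), 1)
cv2.imwrite("proposals.jpg", image)
```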
It takes too much time to train.
It cannot run in real time.
Selective search is a fixed algorithm, so no learning happens at that stage.
This could lead to the generation of bad candidate region proposals.
R-CNNs are very slow because of:
Extracting 2,000 regions for each image based on selective search
Extracting features using CNN for every image region.
If we have N images, then the number of CNN feature extractions will be N × 2,000.
Instead of running a CNN 2,000 times per image, we can run it just once per image and get all the regions of interest (regions containing some object).
This is the idea behind Fast R-CNN. It depends on:
We feed the whole image to the CNN
The CNN generates a feature map
Using the generated feature map, we extract ROIs (regions of interest)
The problem of running the CNN 2,000 times is solved
We are still using selective search, though
Then, we resize the regions to a fixed size (using an ROI pooling layer)
Finally, we feed the regions to a fully connected layer (to classify them); a sketch of this flow follows this list
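A sketch of this flow using torchvision's ROI pooling operator; the backbone, image size, and box coordinates below are illustrative assumptions (random weights are enough to show the shapes):

```python
import torch
import torchvision

# Keep a ResNet up to its last conv block -> feature map with stride 32
backbone = torchvision.models.resnet50(weights=None)
feature_extractor = torch.nn.Sequential(*list(backbone.children())[:-2]).eval()

image = torch.rand(1, 3, 800, 800)           # the whole image goes through the CNN once
with torch.no_grad():
    feature_map = feature_extractor(image)   # 1 x 2048 x 25 x 25

# Region proposals in image coordinates: (batch_index, x1, y1, x2, y2)
rois = torch.tensor([[0, 100.0, 120.0, 400.0, 380.0],
                     [0,  50.0,  60.0, 220.0, 300.0]])

# ROI pooling resizes each region of the feature map to a fixed 7x7 grid
pooled = torchvision.ops.roi_pool(feature_map, rois, output_size=(7, 7),
                                  spatial_scale=1 / 32)   # maps image coords to feature coords
print(pooled.shape)   # torch.Size([2, 2048, 7, 7]), ready for the fully connected head
```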
Region proposals are still a bottleneck in the Fast R-CNN algorithm and affect its performance.
Faster R-CNN fixes the selective search problem by replacing it with a Region Proposal Network (RPN).
So, Faster R-CNN depends on:
We feed the whole image to the CNN
The CNN generates a feature map
We apply a Region Proposal Network (RPN) to the feature map
The RPN returns object proposals along with their objectness scores
The problem of selective search is solved
Then, we resize the regions to a fixed size (using an ROI pooling layer)
Finally, we feed the regions to a fully connected layer (to classify them); a minimal end-to-end usage sketch follows this list
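For reference, torchvision ships a ready-made Faster R-CNN (backbone + RPN + ROI head); a minimal usage sketch, assuming a recent torchvision with COCO-pretrained weights and a random tensor used only to show the call:

```python
import torch
from torchvision.models.detection import (fasterrcnn_resnet50_fpn,
                                           FasterRCNN_ResNet50_FPN_Weights)

model = fasterrcnn_resnet50_fpn(weights=FasterRCNN_ResNet50_FPN_Weights.DEFAULT)
model.eval()

images = [torch.rand(3, 600, 800)]        # list of C x H x W tensors with values in [0, 1]
with torch.no_grad():
    predictions = model(images)

# Each prediction holds the boxes, class labels, and confidence scores for one image
print(predictions[0]["boxes"].shape, predictions[0]["labels"][:5], predictions[0]["scores"][:5])
```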
The RPN takes the feature map from the CNN
It slides a 3×3 window over the map
At each window position it generates k anchor boxes
The boxes have different shapes and sizes
Anchor boxes are fixed-size bounding boxes placed throughout the image with different shapes and sizes. For each anchor, the RPN predicts two things:
The probability that an anchor is an object
(it does not consider which class the object belongs to)
The bounding box regressor for adjusting the anchor to better fit the object (a sketch of anchor generation follows below)
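A rough sketch of the anchor generation step; the stride, scales, and aspect ratios below are the commonly used defaults (3 scales × 3 ratios, so k = 9), assumed here for illustration. The RPN's two small heads then predict an objectness score and four box offsets for every anchor:

```python
import numpy as np

def make_anchors(feat_h, feat_w, stride=16,
                 scales=(128, 256, 512), ratios=(0.5, 1.0, 2.0)):
    """Generate k = len(scales) * len(ratios) anchors per feature-map location."""
    anchors = []
    for i in range(feat_h):
        for j in range(feat_w):
            cx, cy = (j + 0.5) * stride, (i + 0.5) * stride   # anchor center in image coords
            for s in scales:
                for r in ratios:                              # r is the height/width ratio
                    w, h = s / np.sqrt(r), s * np.sqrt(r)     # same area, different shape
                    anchors.append([cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2])
    return np.array(anchors)                                  # (feat_h * feat_w * k, 4)

anchors = make_anchors(feat_h=25, feat_w=25)
print(anchors.shape)   # (5625, 4): 9 anchors at each of the 25 x 25 locations
```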
| Algorithm | Summary | Limitations |
| --- | --- | --- |
| R-CNN | Extracts around 2,000 regions from each image using selective search | High computation time |
| Fast R-CNN | The image is passed to the CNN once to extract a feature map; regions are then extracted by selective search | Selective search is slow |
| Faster R-CNN | Replaces the selective search method with an RPN | Still relatively slow at inference time |