CNNs have been used extensively to classify images. But detecting objects in an image and drawing bounding boxes around them is a harder problem. To address it, the R-CNN algorithm was published in 2014. Variants such as Fast R-CNN, Faster R-CNN and Mask R-CNN followed, each improving on object detection. To understand the latest R-CNN variants, it is important to have a clear understanding of R-CNN itself. Once that is in place, the other variants are much easier to follow.
This post assumes that the reader is familiar with SVMs, image classification using CNNs, and linear regression.
The R-CNN paper [1] was published in 2014. It was the first paper to show that CNNs can achieve high performance in object detection. The algorithm performs object detection in the following way:
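As a rough orientation, the control flow of such a detector can be sketched as below: propose candidate regions, compute a feature vector for each region, score it with a classifier, and keep the high-scoring boxes that do not overlap too much. This is only a sketch under stated assumptions, not the paper's implementation. Every name here (`detect`, `propose`, `featurize`, `score`, `nms`) is hypothetical, the sliding-window proposer and mean-intensity "feature" are toy stand-ins for the region proposal method and CNN used in R-CNN, and the thresholds are illustrative.

```python
from typing import Callable, List, Sequence, Tuple

Box = Tuple[int, int, int, int]          # (x1, y1, x2, y2), end-exclusive
Detection = Tuple[Box, float]            # (box, score)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(dets: List[Detection], iou_thresh: float) -> List[Detection]:
    """Greedy non-maximum suppression: keep the best box, drop heavy overlaps."""
    kept: List[Detection] = []
    for box, s in sorted(dets, key=lambda d: d[1], reverse=True):
        if all(iou(box, k[0]) <= iou_thresh for k in kept):
            kept.append((box, s))
    return kept


def detect(image, propose: Callable, featurize: Callable, score: Callable,
           score_thresh: float = 0.0, iou_thresh: float = 0.3) -> List[Detection]:
    """Propose regions, featurize each, score it, then suppress overlaps."""
    candidates: List[Detection] = []
    for box in propose(image):
        s = score(featurize(image, box))
        if s > score_thresh:
            candidates.append((box, s))
    return nms(candidates, iou_thresh)


# --- Toy stand-ins (assumptions for illustration only) ---
def sliding_windows(image, size: int = 4, stride: int = 2) -> List[Box]:
    h, w = len(image), len(image[0])
    return [(x, y, x + size, y + size)
            for y in range(0, h - size + 1, stride)
            for x in range(0, w - size + 1, stride)]


def mean_intensity(image, box: Box) -> Sequence[float]:
    x1, y1, x2, y2 = box
    pixels = [image[y][x] for y in range(y1, y2) for x in range(x1, x2)]
    return [sum(pixels) / len(pixels)]


def linear_score(feat: Sequence[float]) -> float:
    return feat[0] - 0.5  # stand-in for a trained linear classifier


# 8x8 image with one bright 4x4 "object" at rows/cols 2..5
image = [[1 if 2 <= x < 6 and 2 <= y < 6 else 0 for x in range(8)]
         for y in range(8)]
detections = detect(image, sliding_windows, mean_intensity, linear_score)
```

Passing the proposer, feature extractor and scorer in as functions keeps the skeleton unchanged if any stand-in is swapped for a real component.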
The rest of the post dives into the details of how the model is trained and how it predicts bounding boxes.