In this blog post, I explain the architecture described in the paper “Rich feature hierarchies for accurate object detection and semantic segmentation”. Although the paper has been around for quite a while, there is still a lot to learn from it beyond the architecture. I start with a brief overview of the OverFeat network and then move on to R-CNN. If you are not familiar with OverFeat, don’t worry: you won’t miss anything.

Also, the structure of this blog is a bit different. It’s more like a conversation between a student and a teacher (I am trying to learn with the Feynman technique). The student’s questions are highlighted in bold, just in case you are in a rush.


I am well versed in the famous image classification models such as VGG, AlexNet, ResNet, InceptionNet, and MobileNet (and all their variants), to name a few. I was amazed by their architectures and would like to expand my knowledge in this domain further. However, one question stays on my mind: these models can only tell whether an image contains an object or not. How can I work with models that also tell where in the image that object is?


Understanding Regions with CNN features (R-CNN)