Chat Images To Textual Conversation. Integrating Emoji Detection and Recognition while Recognizing text messages in Chat Images

So I am back with part two of Chat Images to Textual Conversation. If you have not read part 1 yet, I definitely recommend reading it at the link below:

Problem and Exploration

This post addresses the one main problem that part 1 left open: “Emoji Detection and Recognition in Chat Images”.

This was an open-ended problem for us, so Akshat Sharma and I tried several different approaches. At first glance it looked like an object detection and recognition problem: detect the emojis first, then classify each one into one of several categories. We researched approaches to this type of problem and found that people generally use Convolutional Neural Networks (CNNs). Even within this domain, there was a wide range of options to choose from. The paper “Object Detection with Deep Learning: A Review” gives a good overview of the approaches that can be used for object detection and recognition, from Region-based Convolutional Neural Networks (R-CNNs) to You Only Look Once (YOLO).

After going through several blogs and research results, we came to the conclusion that Faster R-CNN is the most accurate of the options we looked at. YOLO and Single Shot Detector (SSD) are faster, but several sources report that they do not perform well on small objects, as mentioned in these blogs (1, 2), which cite several research papers. That matters here, because emojis take up only a small part of a chat image.
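To make the "detect first, then classify" idea concrete, here is a minimal sketch in Python. It is not our actual model: `toy_detector` and `toy_classifier` are hypothetical stand-ins for a trained network, and the image is just a tiny grid of numbers. In a detector such as Faster R-CNN both steps happen inside one network; they are separated here only to show the flow of data.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

Box = Tuple[int, int, int, int]      # (x_min, y_min, x_max, y_max) in pixels
Image = Sequence[Sequence[int]]      # toy grayscale image: rows of pixel values


@dataclass
class EmojiPrediction:
    box: Box
    label: str


def crop(image: Image, box: Box) -> List[List[int]]:
    """Cut the region described by `box` out of the image."""
    x0, y0, x1, y1 = box
    return [list(row[x0:x1]) for row in image[y0:y1]]


def detect_then_classify(
    image: Image,
    detector: Callable[[Image], List[Box]],
    classifier: Callable[[List[List[int]]], str],
) -> List[EmojiPrediction]:
    """Step 1: find candidate emoji boxes. Step 2: label each cropped box."""
    return [
        EmojiPrediction(box, classifier(crop(image, box)))
        for box in detector(image)
    ]


# Hypothetical stand-ins, for illustration only (assumptions, not real models).
def toy_detector(image: Image) -> List[Box]:
    # Pretend the detector found two emoji-sized regions.
    return [(0, 0, 2, 2), (2, 0, 4, 2)]


def toy_classifier(patch: List[List[int]]) -> str:
    # Label a patch by its mean brightness; a real classifier would be a CNN.
    mean = sum(sum(row) for row in patch) / (len(patch) * len(patch[0]))
    return "smile" if mean > 127 else "heart"


toy_image = [
    [200, 200, 10, 10],
    [200, 200, 10, 10],
]
predictions = detect_then_classify(toy_image, toy_detector, toy_classifier)
for p in predictions:
    print(p.box, p.label)
```

Swapping in a real detector or classifier only means passing different functions to `detect_then_classify`, which is why we could compare several approaches without changing the rest of the pipeline.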

