The Magic of Reactive Supervision: High-Quality NLP Data Labeling Using Social Media Interactions

This story is based on the paper “Reactive Supervision: A New Method for Collecting Sarcasm Data” by Shmueli et al., to appear in EMNLP 2020.

Machine learning models are only as good as their training data. Noisy, inaccurate labels can lead to disastrous predictions, yet getting enough data with high-quality labels is often the most challenging part of the machine learning pipeline. For NLP tasks in particular, obtaining labeled or annotated text can be a messy and expensive ordeal.

For example, say you need labeled data to train a classifier for sarcasm detection. Sarcasm is a form of insincere speech, somewhat akin to lying. But there’s a difference between lies and sarcastic utterances: when you lie, you try _not_ to get caught. When you’re being sarcastic, you actually want your audience to “get it”. You might tell your friend “Oh yes, statistics is so much fun…”, but you say it in a snarky voice because you want your friend to know that you actually hate statistics.

This leads us to why automatic sarcasm detection for text is so important. If an NLP system doesn’t detect the sarcasm in a sentence, it reads the text as meaning the opposite of what the author intended. Imagine that you build a sentiment classifier. If your classifier misses the sarcasm in, say, a restaurant review (“The soup was cold, the dessert was stale. BEST MEAL EVER!”), it might misclassify the post’s sentiment as “positive”. Or think about AI chatbots: misunderstanding the user’s sarcasm can be catastrophic.

