Identification of Complaint Relevant Posts on Social Media

In this post, I explain the work done in the paper “Semi-Supervised Iterative Approach for Domain-Specific Complaint Detection in Social Media”, accepted at the 3rd Workshop on e-Commerce and NLP at the Association for Computational Linguistics (ACL) 2020. This top-tier venue brings together research in areas including, but not limited to, computational linguistics, cognitive modeling, information extraction, and semantics. This work is one of the first attempts to leverage the discursive social media landscape to list complaints and identify grievances. We demonstrate the utility of our approach by evaluating it on transport-related services on Twitter. The post gives a brief overview of the motivation, methodology, and applications of this research. More technical details can be found in the paper. Our team looks forward to any suggestions for improving this work.

Motivation

Social media has lately become one of the primary venues where users express their opinions about various products and services. These opinions are extremely useful for understanding users’ perceptions of and sentiment about these services. They are also valuable for identifying potential defects and are critical to the execution of downstream customer service responses. Public-sector services such as transport and logistics are strongly affected by public opinion and form a critical part of a country’s economy. Businesses often rely on social media to gather customer feedback and to initiate a response. Therefore, automatic detection of user complaints on social media could prove beneficial to both customers and service providers.


A social media user tagging relevant authorities with their grievances.

Source: https://money.cnn.com/data/sectors/transportation/?sector=4600&industry=4610

Transportation-related companies having significant market shares.

Traditionally, listing complaints involves social media users tagging the relevant individuals in their posts. However, several drawbacks reduce the utility of this approach.

Posts in which the concerned authorities are tagged are rare compared to other posts. Additionally, social media platforms are plagued with redundancy, where posts are rephrased or structurally altered before being re-posted. Finally, the vast amount of unavoidable noise makes it hard to identify posts that may require immediate attention.

Our Contribution

To build such detection systems, we could employ supervised approaches, which typically require a large corpus of labeled training samples. However, as discussed, labeling social media posts that capture complaints about a particular service is challenging. Prior work in event detection has demonstrated that simple linguistic indicators (phrases or n-grams) can be useful for accurately discovering events in social media. Although user complaints are not events but rather a kind of speech act, we posit that similar indicators can be used in complaint detection. To pursue this hypothesis, we propose a semi-supervised iterative approach to identify social media posts that complain about a specific service.

In our experiments, we started with an annotated set of 326 samples of transportation complaints and, after four iterations of the approach, collected 2,840 indicators and over 3,700 tweets. We annotated a random sample of 700 tweets from the final dataset and observed that over 47% of the samples were actual transportation complaints. We also characterize the performance of basic classification algorithms on this dataset and, in doing so, study how different linguistic features contribute to the performance of a supervised model in this domain.
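To make the iterative idea more concrete, here is a minimal sketch of an indicator-based bootstrapping loop. It is not the implementation from the paper: the function names (`ngrams`, `top_indicators`, `bootstrap`), the whitespace tokenization, the scoring rule (how much more often an n-gram appears in the relevant posts than in the unlabeled pool), and the parameters `k` and `min_count` are all assumptions made for illustration.

```python
from collections import Counter


def ngrams(text, n_max=2):
    """Return the set of word n-grams (up to n_max words) in a post."""
    tokens = text.lower().split()
    grams = set()
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            grams.add(" ".join(tokens[i:i + n]))
    return grams


def top_indicators(relevant, background, k, min_count=2):
    """Pick the k n-grams that are most over-represented in `relevant`.

    Assumed scoring rule: share of relevant posts containing the n-gram,
    divided by its add-one smoothed share in the background posts.
    """
    rel_counts = Counter(g for post in relevant for g in ngrams(post))
    bg_counts = Counter(g for post in background for g in ngrams(post))
    scores = {}
    for gram, count in rel_counts.items():
        if count < min_count:
            continue  # ignore n-grams seen in too few relevant posts
        rel_rate = count / len(relevant)
        bg_rate = (bg_counts[gram] + 1) / (len(background) + 1)
        scores[gram] = rel_rate / bg_rate
    # Break ties alphabetically so the result is deterministic.
    return sorted(scores, key=lambda g: (-scores[g], g))[:k]


def bootstrap(seed_posts, pool, iterations=4, k=2):
    """Grow a set of complaint-like posts from a small labeled seed set."""
    relevant = list(seed_posts)
    remaining = list(pool)
    indicators = set()
    for _ in range(iterations):
        # 1. Learn indicators from the posts collected so far.
        indicators.update(top_indicators(relevant, pool, k))
        # 2. Pull in unlabeled posts that contain any indicator.
        matched = [p for p in remaining if ngrams(p) & indicators]
        if not matched:
            break  # nothing new was found, so stop early
        relevant.extend(matched)
        remaining = [p for p in remaining if p not in matched]
    return indicators, relevant


if __name__ == "__main__":
    seeds = ["the bus was delayed", "my train was delayed"]
    unlabeled = [
        "bus delayed for an hour",
        "it was lovely weather",
        "train delayed again",
        "coffee was great",
    ]
    found_indicators, found_posts = bootstrap(seeds, unlabeled)
    print(sorted(found_indicators))
    print(found_posts)
```

Each round learns indicators from the posts gathered so far and then uses them to retrieve more posts, which is why a small seed set can grow into a much larger collection. Posts retrieved this way are only candidates, so a manual annotation pass like the one described above is still needed to measure how many are actual complaints.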

#artificial-intelligence #nlp #computational-linguistics #social-computing
