TL;DR: Understand the spam or ham classifier from the perspective of artificial intelligence concepts, work with various classification algorithms, select the one that produces the highest accuracy, and develop a Python Flask app that detects whether an SMS is spam or ham.

Short Message Service (SMS) is far more than just a technology for chat. SMS technology evolved out of the Global System for Mobile Communications (GSM) standard, which is internationally accepted [1]. An SMS can contain only a limited number of characters, which include letters, numbers, and a few symbols.

Spam is the abuse of electronic messaging systems to send unsolicited messages in bulk indiscriminately [2]. While the most widely recognized form of spam is email spam, the term is applied to similar abuses in other media. SMS spam is very similar to email spam: typically, unsolicited bulk messaging with some business interest. It is used for commercial advertising and for spreading phishing links. Because sending SMS spam is illegal in most countries, commercial spammers use malware to send it; sending spam from a compromised machine reduces the risk to the spammer because it obscures the provenance of the spam. The low price and the high bandwidth of the SMS network have attracted a large amount of SMS spam [4].

A look through the messages shows a clear pattern. Almost all of the spam messages ask the user to call a number, reply by SMS, or visit some URL. This pattern can be observed in the results of a simple SQL query on the spam corpus [3].
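The call-to-action pattern described above (call a number, reply by SMS, visit a URL) can be illustrated with a few regular expressions. This is only a rough sketch: the pattern names, the regular expressions, and the sample messages are assumptions made for illustration, and are not the SQL query or the corpus referenced in [3].

```python
import re

# Hypothetical patterns, one per call to action mentioned in the text.
# They are deliberately simple and will miss many real-world variants.
CALL_PATTERN = re.compile(r"\bcall\b.*?\d{5,}", re.IGNORECASE)   # "call" followed by a long number
REPLY_PATTERN = re.compile(r"\b(?:reply|txt)\b", re.IGNORECASE)  # "reply" or "txt" as a whole word
URL_PATTERN = re.compile(r"(?:https?://|www\.)\S+", re.IGNORECASE)


def call_to_action_flags(message: str) -> dict:
    """Report which of the three calls to action appear in a message."""
    return {
        "call_number": bool(CALL_PATTERN.search(message)),
        "reply_sms": bool(REPLY_PATTERN.search(message)),
        "visit_url": bool(URL_PATTERN.search(message)),
    }


def looks_like_spam(message: str) -> bool:
    """Crude heuristic: flag a message if any call to action is present."""
    return any(call_to_action_flags(message).values())
```

A heuristic like this flags legitimate messages too (a friend can also ask you to reply), which is one reason a learned classifier is preferable to hand-written rules.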

People describe SMS spam as annoying (32.3%), a waste of time (24.8%), and a violation of personal privacy (21.3%) [5].

Every time SMS spam arrives in a user’s inbox, the mobile phone alerts the user to the incoming message. When the user realizes that the message is unwanted, he or she is disappointed. SMS spam also takes up some of the mobile phone’s storage.

SMS spam detection is the task of identifying and filtering spam SMS messages. As growing numbers of SMS messages are exchanged every day, it is challenging for a user to remember newer messages and relate them to previously received ones. Thus, using artificial intelligence together with machine learning and data mining, we will try to develop a web-based SMS spam or ham detector.

This is a three-part blog series in which we will understand the ins and outs of the spam or ham classifier from the perspective of artificial intelligence concepts, work with various classification algorithms in a Jupyter notebook, and select one algorithm based on performance criteria. Then we will develop a Python web-based SMS spam or ham detector.

What we will cover here

  • Theoretical AI Concept Regarding Spam or Ham Classifier
  • Classification Algorithms
  • Exploring Data Source
  • Data Preparation
  • Exploratory Data Analysis
  • Naïve Bayes Behind Spam or Ham
  • Performance Measurement Criterion
  • Development of Spam or Ham Detector
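
As a preview of the Naïve Bayes and performance-measurement topics listed above, here is a minimal sketch of a multinomial Naive Bayes classifier with Laplace smoothing, written in plain Python. It is not the implementation developed in the series: the class name `NaiveBayesSketch`, the `accuracy` helper, and the four training messages are all made up for illustration.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


class NaiveBayesSketch:
    """Toy multinomial Naive Bayes with Laplace (add-alpha) smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def fit(self, messages, labels):
        self.class_counts = Counter(labels)      # messages per class
        self.word_counts = defaultdict(Counter)  # word frequencies per class
        for message, label in zip(messages, labels):
            self.word_counts[label].update(tokenize(message))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, message):
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_messages in self.class_counts.items():
            # Start from the log prior, then add one log likelihood per known word.
            score = math.log(n_messages / total)
            denom = sum(self.word_counts[label].values()) + self.alpha * len(self.vocab)
            for word in tokenize(message):
                if word in self.vocab:  # words never seen in training are skipped
                    score += math.log((self.word_counts[label][word] + self.alpha) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def accuracy(model, messages, labels):
    """Fraction of messages whose predicted label matches the true label."""
    correct = sum(model.predict(m) == y for m, y in zip(messages, labels))
    return correct / len(labels)


# Made-up training data, for illustration only.
TOY_MESSAGES = [
    "win a free prize call now",
    "free entry call this number now",
    "are we meeting for lunch today",
    "call me when you get home",
]
TOY_LABELS = ["spam", "spam", "ham", "ham"]
```

Calling `NaiveBayesSketch().fit(TOY_MESSAGES, TOY_LABELS)` returns a fitted model whose `predict` method labels a new message; accuracy measured on four toy messages says nothing about real-world performance, which is why a proper dataset and evaluation criteria are needed.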


The Ultimate Guide To SMS: Spam or Ham Detector