Chatbots have become extremely popular. Highly developed assistants like Siri, Cortana, and Alexa have surprised people with their intelligence and capabilities. A chatbot can be defined as:

Chatbots can be as simple as rudimentary programs that answer a simple query with a single-line response, or as sophisticated as digital assistants that learn and evolve to deliver increasing levels of personalization as they gather and process information.

Many articles present a basic comment-response model. My attempt is to take this a bit further: to generalize how user comments are handled, and perhaps give the chatbot additional capabilities using simple web requests, automation, and scraping.

Before we start, I want to set expectations about what we are looking at. We will not be building a super-smart chatbot like Siri, because that would require enormous experience and expertise. Still, it would be pretty cool if our chatbot could help us book a hotel, play a song for us, tell us the weather report, and so on. We will try to implement all of these features using just some basic Python web-handling libraries, together with NLTK, Python's Natural Language Toolkit. So, let's start.

Based on their design, chatbots are mainly of two types:

  1. Rule-Based
  2. Self-Learned

We are going to use a combination of the two. In the rule-based approach, a set of ground rules is defined, and the chatbot can only operate within those rules in a constrained manner. In the self-learned version, neural networks are trained on a set of example interactions so that the chatbot learns how to reply to a user. For the task-oriented parts we will use a rule-based approach, and for general conversation we will use a self-learned approach. I found this combined approach much more effective than a fully self-learned one.
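The rule-based half of this combination can be sketched with plain pattern matching. The patterns and canned replies below are illustrative assumptions, not the article's actual rule set; the point is that a rule match short-circuits the self-learned model:

```python
import re

# Hypothetical rules mapping regex patterns to canned replies.
# These patterns and responses are illustrative only.
RULES = [
    (re.compile(r"\b(hi|hello|hey)\b", re.I), "Hello! How can I help you?"),
    (re.compile(r"\bweather\b", re.I), "Fetching the weather report..."),
    (re.compile(r"\bplay\b.*\bsong\b", re.I), "Playing a song for you..."),
]

def rule_based_reply(message):
    """Return the first matching canned reply, or None so the caller
    can fall back to the self-learned component."""
    for pattern, reply in RULES:
        if pattern.search(message):
            return reply
    return None  # no rule matched: hand off to the self-learned model
```

A dispatcher would call `rule_based_reply` first and only invoke the trained model when it returns `None`, which is what makes the combined design constrained for tasks but flexible for chit-chat.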

Working of NLTK Library

Before we jump into the application, let us look at how NLTK works and how it is used in natural language processing. There are five main components of natural language processing:

  • Morphological and Lexical Analysis
  • Syntactic Analysis
  • Semantic Analysis
  • Discourse Integration
  • Pragmatic Analysis

Morphological and Lexical Analysis: Lexical analysis involves identifying and describing the structure of words. It includes dividing a text into paragraphs, sentences, and words.

Syntactic Analysis: Words are commonly accepted as the smallest units of syntax. Syntax refers to the principles and rules that govern sentence structure in a language, focusing on the proper ordering of words, which can affect a sentence's meaning.

Semantic Analysis: This component maps linear sequences of words into structures that show how the words are associated with each other. Semantics focuses only on the literal meaning of words, phrases, and sentences.

Discourse Integration: This means making sense of the context. The meaning of a single sentence can depend on the sentences that precede it, and it may in turn shape the meaning of the sentences that follow.

Pragmatic Analysis: Pragmatic analysis deals with the overall communicative and social context and its effect on interpretation. It means deriving the meaningful use of language in real situations.

Now, let’s talk about the methods or functions used to implement these five components:

Tokenization: Tokenization is the process by which a large quantity of text is divided into smaller parts called tokens. It takes in a sentence and decomposes it into the smallest extractable units, typically words and punctuation.

Parts of Speech Tagging: This is a very useful tool in NLP. We know the various parts of speech, such as verb, noun, and adjective. If we can tag which part of speech each word belongs to, it becomes much easier to understand the context of a sentence.

Lemmatization: This operation is another very useful tool in NLP. Words that share the same meaning but vary with context or sentence are reduced to their root word, the lemma. This is very important for pattern matching and for rule-based approaches.

All these facilities are provided by Python's NLTK library. Now, let's check how the self-learning algorithm works.
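To give a flavor of the self-learned side before diving in, here is a minimal sketch of a retrieval-style reply: the user's message is compared against a small corpus using bag-of-words cosine similarity, and the response paired with the closest entry is returned. The corpus and responses below are illustrative assumptions, and this simple similarity lookup stands in for a trained neural network:

```python
import math
from collections import Counter

# Illustrative training pairs: a known utterance and its reply.
CORPUS = [
    "hello how are you",
    "what is the weather today",
    "play some music for me",
]
RESPONSES = [
    "Hi! I am doing fine.",
    "Let me fetch the weather report.",
    "Sure, playing a song now.",
]

def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in set(a) & set(b))
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def learned_reply(message):
    """Return the response paired with the most similar corpus entry."""
    query = Counter(message.lower().split())
    scores = [cosine(query, Counter(doc.split())) for doc in CORPUS]
    return RESPONSES[max(range(len(scores)), key=scores.__getitem__)]
```

A real self-learned component would replace the word-count vectors with learned representations, but the match-then-respond flow is the same.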

#nlp #chatbots #rule-based #neural-networks #python

Designing A ChatBot Using Python: A Modified Approach