Introduction Natural Language Processing (NLP) in Python

Introduction Natural Language Processing (NLP) in Python

Natural Language Processing (NLP) is a branch of Artificial Intelligence that deals with the interaction between computers and humans using natural language. NLP is used to read, decipher, understand, and make sense of the human languages in a manner that is valuable.

Natural Language Processing (NLP) is a branch of Artificial Intelligence that deals with the interaction between computers and humans using natural language.
NLP is used to read, decipher, understand, and make sense of the human languages in a manner that is valuable.

Top Python Development Companies | Hire Python Developers

Top Python Development Companies | Hire Python Developers

After analyzing clients and market requirements, TopDevelopers has come up with the list of the best Python service providers. These top-rated Python developers are widely appreciated for their professionalism in handling diverse projects. When...

After analyzing clients and market requirements, TopDevelopers has come up with the list of the best Python service providers. These top-rated Python developers are widely appreciated for their professionalism in handling diverse projects. When you look for the developer in hurry you may forget to take note of review and ratings of the company's aspects, but we at TopDevelopers have done a clear analysis of these top reviewed Python development companies listed here and have picked the best ones for you.

List of Best Python Web Development Companies & Expert Python Programmers.

Python - NLP Analysis of Restaurant Reviews

Python - NLP Analysis of Restaurant Reviews

Python | NLP analysis of Restaurant reviews. Natural language processing (NLP) is an area of computer science and artificial intelligence concerned with the interactions between computers and human (natural) languages, in particular how to program computers to process and analyze large amounts of natural language data.

Natural language processing (NLP) is an area of computer science and artificial intelligence concerned with the interactions between computers and human (natural) languages, in particular how to program computers to process and analyze large amounts of natural language data. It is the branch of machine learning which is about analyzing any text and handling predictive analysis.

Scikit-learn is a free software machine learning library for Python programming language. Scikit-learn is largely written in Python, with some core algorithms written in Cython to achieve performance. Cython is a superset of the Python programming language, designed to give C-like performance with code that is written mostly in Python.

For my project, I looked into Yelp reviews and use Natural Language Processing (NLP) to extract more meaning out of them.
Project Abstract

Let’s understand the various steps involved in text processing and the flow of NLP.

This algorithm can be easily applied to any other kind of text like classify book into like Romance, Friction, but for now, let’s use a restaurant review dataset to review negative or positive feedback.

Steps involved:

Step 1: Import dataset with setting delimiter as ‘\t’ as columns are separated as tab space. Reviews and their category(0 or 1) are not separated by any other symbol but with tab space as most of the other symbols are is the review (like $ for price, ….!, etc) and the algorithm might use them as delimiter, which will lead to strange behavior (like errors, weird output) in output.

# Importing Libraries 
import numpy as np 
import pandas as pd 

# Import dataset 
dataset = pd.read_csv('Restaurant_Reviews.tsv', delimiter = '\t') 

To download the Restaurant_Reviews.tsv dataset used, click here.

Step 2: Text Cleaning or Preprocessing

  • Remove Punctuations, Numbers: Punctuations, Numbers doesn’t help much in processong the given text, if included, they will just increase the size of bag of words that we will create as last step and decrase the efficency of algorithm.
  • Stemming: Take roots of the word
  • Convert each word into its lower case: For example, it useless to have same words in different cases (eg ‘good’ and ‘GOOD’).
# library to clean data 
import re 

# Natural Language Tool Kit 
import nltk 

nltk.download('stopwords') 

# to remove stopword 
from nltk.corpus import stopwords 

# for Stemming propose 
from nltk.stem.porter import PorterStemmer 

# Initialize empty array 
# to append clean text 
corpus = [] 

# 1000 (reviews) rows to clean 
for i in range(0, 1000): 
	
	# column : "Review", row ith 
	review = re.sub('[^a-zA-Z]', ' ', dataset['Review'][i]) 
	
	# convert all cases to lower cases 
	review = review.lower() 
	
	# split to array(default delimiter is " ") 
	review = review.split() 
	
	# creating PorterStemmer object to 
	# take main stem of each word 
	ps = PorterStemmer() 
	
	# loop for stemming each word 
	# in string array at ith row	 
	review = [ps.stem(word) for word in review 
				if not word in set(stopwords.words('english'))] 
				
	# rejoin all string array elements 
	# to create back into a string 
	review = ' '.join(review) 
	
	# append each string to create 
	# array of clean text 
	corpus.append(review) 

Examples: Before and after applying above code (reviews = > before, corpus => after)

Step 3: Tokenization, involves splitting sentences and words from the body of the text.

Step 4: Making the bag of words via sparse matrix

  • Take all the different words of reviews in the dataset without repeating of words.
  • One column for each word, therefore there are going to be many columns.
  • Rows are reviews
  • If word is there in row of dataset of reviews, then the count of word will be there in row of bag of words under the column of the word.

Examples: Let’s take a dataset of reviews of only two reviews

Input : "dam good steak", "good food good servic"
Output :
 ![](https://media.geeksforgeeks.org/wp-content/uploads/eg.png)

For this purpose we need CountVectorizer class from sklearn.feature_extraction.text.
We can also set a max number of features (max no. features which help the most via attribute “max_features”). Do the training on the corpus and then apply the same transformation to the corpus “.fit_transform(corpus)” and then convert it into an array. If review is positive or negative that answer is in the second column of the dataset[:, 1] : all rows and 1st column (indexing from zero).

# Creating the Bag of Words model 
from sklearn.feature_extraction.text import CountVectorizer 

# To extract max 1500 feature. 
# "max_features" is attribute to 
# experiment with to get better results 
cv = CountVectorizer(max_features = 1500) 

# X contains corpus (dependent variable) 
X = cv.fit_transform(corpus).toarray() 

# y contains answers if review 
# is positive or negative 
y = dataset.iloc[:, 1].values 


Description of the dataset to be used:

  • Columns seperated by \t (tab space)
  • First column is about reviews of people
  • In second column, 0 is for negative review and 1 is for positive review

Step 5 : Splitting Corpus into Training and Test set. For this, we need class train_test_split from sklearn.cross_validation. Split can be made 70/30 or 80/20 or 85/15 or 75/25, here I choose 75/25 via “test_size”.
X is the bag of words, y is 0 or 1 (positive or negative).

# Splitting the dataset into 
# the Training set and Test set 
from sklearn.cross_validation import train_test_split 

# experiment with "test_size" 
# to get better results 
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.25) 

Step 6: Fitting a Predictive Model (here random forest)

  • Since Random fored is ensemble model (made of many trees) from sklearn.ensemble, import RandomForestClassifier class
  • With 501 tree or “n_estimators” and criterion as ‘entropy’
  • Fit the model via .fit() method with attributes X_train and y_train
# Fitting Random Forest Classification 
# to the Training set 
from sklearn.ensemble import RandomForestClassifier 

# n_estimators can be said as number of 
# trees, experiment with n_estimators 
# to get better results 
model = RandomForestClassifier(n_estimators = 501, 
							criterion = 'entropy') 
							
model.fit(X_train, y_train) 

Step 7: Pridicting Final Results via using .predict() method with attribute X_test

# Predicting the Test set results 
y_pred = model.predict(X_test) 

y_pred 

Note: Accuracy with random forest was 72%.(It may be different when performed experiment with different test size, here = 0.25).

Step 8: To know the accuracy, confusion matrix is needed.

Confusion Matrix is a 2X2 Matrix.

TRUE POSITIVE : measures the proportion of actual positives that are correctly identified.
TRUE NEGATIVE : measures the proportion of actual positives that are not correctly identified.
FALSE POSITIVE : measures the proportion of actual negatives that are correctly identified.
FALSE NEGATIVE : measures the proportion of actual negatives that are not correctly identified.

Note : True or False refers to the assigned classification being Correct or Incorrect, while Positive or Negative refers to assignment to the Positive or the Negative Category

# Making the Confusion Matrix 
from sklearn.metrics import confusion_matrix 

cm = confusion_matrix(y_test, y_pred) 

cm 

Thank for reading !

Python GUI Programming Projects using Tkinter and Python 3

Python GUI Programming Projects using Tkinter and Python 3

Python GUI Programming Projects using Tkinter and Python 3

Description
Learn Hands-On Python Programming By Creating Projects, GUIs and Graphics

Python is a dynamic modern object -oriented programming language
It is easy to learn and can be used to do a lot of things both big and small
Python is what is referred to as a high level language
Python is used in the industry for things like embedded software, web development, desktop applications, and even mobile apps!
SQL-Lite allows your applications to become even more powerful by storing, retrieving, and filtering through large data sets easily
If you want to learn to code, Python GUIs are the best way to start!

I designed this programming course to be easily understood by absolute beginners and young people. We start with basic Python programming concepts. Reinforce the same by developing Project and GUIs.

Why Python?

The Python coding language integrates well with other platforms – and runs on virtually all modern devices. If you’re new to coding, you can easily learn the basics in this fast and powerful coding environment. If you have experience with other computer languages, you’ll find Python simple and straightforward. This OSI-approved open-source language allows free use and distribution – even commercial distribution.

When and how do I start a career as a Python programmer?

In an independent third party survey, it has been revealed that the Python programming language is currently the most popular language for data scientists worldwide. This claim is substantiated by the Institute of Electrical and Electronic Engineers, which tracks programming languages by popularity. According to them, Python is the second most popular programming language this year for development on the web after Java.

Python Job Profiles
Software Engineer
Research Analyst
Data Analyst
Data Scientist
Software Developer
Python Salary

The median total pay for Python jobs in California, United States is $74,410, for a professional with one year of experience
Below are graphs depicting average Python salary by city
The first chart depicts average salary for a Python professional with one year of experience and the second chart depicts the average salaries by years of experience
Who Uses Python?

This course gives you a solid set of skills in one of today’s top programming languages. Today’s biggest companies (and smartest startups) use Python, including Google, Facebook, Instagram, Amazon, IBM, and NASA. Python is increasingly being used for scientific computations and data analysis
Take this course today and learn the skills you need to rub shoulders with today’s tech industry giants. Have fun, create and control intriguing and interactive Python GUIs, and enjoy a bright future! Best of Luck
Who is the target audience?

Anyone who wants to learn to code
For Complete Programming Beginners
For People New to Python
This course was designed for students with little to no programming experience
People interested in building Projects
Anyone looking to start with Python GUI development
Basic knowledge
Access to a computer
Download Python (FREE)
Should have an interest in programming
Interest in learning Python programming
Install Python 3.6 on your computer
What will you learn
Build Python Graphical User Interfaces(GUI) with Tkinter
Be able to use the in-built Python modules for their own projects
Use programming fundamentals to build a calculator
Use advanced Python concepts to code
Build Your GUI in Python programming
Use programming fundamentals to build a Project
Signup Login & Registration Programs
Quizzes
Assignments
Job Interview Preparation Questions
& Much More