How To Create Temperature Converter (°C to F & Vice Versa) with Python

In this video, I am going to show you how to create a simple program that converts Celsius to Fahrenheit and vice versa. If you find this video helpful or informative, consider leaving a like and subscribing to my channel. If you have any questions or suggestions on what video should I do next, feel free to comment down below.

Download link for Python: https://www.python.org/

Download link for Atom: https://atom.io/

#python #programming #developer

What is GEEK

Buddha Community

How To Create Temperature Converter (°C to F & Vice Versa) with Python
Easter  Deckow

Easter Deckow

1655630160

PyTumblr: A Python Tumblr API v2 Client

PyTumblr

Installation

Install via pip:

$ pip install pytumblr

Install from source:

$ git clone https://github.com/tumblr/pytumblr.git
$ cd pytumblr
$ python setup.py install

Usage

Create a client

A pytumblr.TumblrRestClient is the object you'll make all of your calls to the Tumblr API through. Creating one is this easy:

client = pytumblr.TumblrRestClient(
    '<consumer_key>',
    '<consumer_secret>',
    '<oauth_token>',
    '<oauth_secret>',
)

client.info() # Grabs the current user information

Two easy ways to get your credentials to are:

  1. The built-in interactive_console.py tool (if you already have a consumer key & secret)
  2. The Tumblr API console at https://api.tumblr.com/console
  3. Get sample login code at https://api.tumblr.com/console/calls/user/info

Supported Methods

User Methods

client.info() # get information about the authenticating user
client.dashboard() # get the dashboard for the authenticating user
client.likes() # get the likes for the authenticating user
client.following() # get the blogs followed by the authenticating user

client.follow('codingjester.tumblr.com') # follow a blog
client.unfollow('codingjester.tumblr.com') # unfollow a blog

client.like(id, reblogkey) # like a post
client.unlike(id, reblogkey) # unlike a post

Blog Methods

client.blog_info(blogName) # get information about a blog
client.posts(blogName, **params) # get posts for a blog
client.avatar(blogName) # get the avatar for a blog
client.blog_likes(blogName) # get the likes on a blog
client.followers(blogName) # get the followers of a blog
client.blog_following(blogName) # get the publicly exposed blogs that [blogName] follows
client.queue(blogName) # get the queue for a given blog
client.submission(blogName) # get the submissions for a given blog

Post Methods

Creating posts

PyTumblr lets you create all of the various types that Tumblr supports. When using these types there are a few defaults that are able to be used with any post type.

The default supported types are described below.

  • state - a string, the state of the post. Supported types are published, draft, queue, private
  • tags - a list, a list of strings that you want tagged on the post. eg: ["testing", "magic", "1"]
  • tweet - a string, the string of the customized tweet you want. eg: "Man I love my mega awesome post!"
  • date - a string, the customized GMT that you want
  • format - a string, the format that your post is in. Support types are html or markdown
  • slug - a string, the slug for the url of the post you want

We'll show examples throughout of these default examples while showcasing all the specific post types.

Creating a photo post

Creating a photo post supports a bunch of different options plus the described default options * caption - a string, the user supplied caption * link - a string, the "click-through" url for the photo * source - a string, the url for the photo you want to use (use this or the data parameter) * data - a list or string, a list of filepaths or a single file path for multipart file upload

#Creates a photo post using a source URL
client.create_photo(blogName, state="published", tags=["testing", "ok"],
                    source="https://68.media.tumblr.com/b965fbb2e501610a29d80ffb6fb3e1ad/tumblr_n55vdeTse11rn1906o1_500.jpg")

#Creates a photo post using a local filepath
client.create_photo(blogName, state="queue", tags=["testing", "ok"],
                    tweet="Woah this is an incredible sweet post [URL]",
                    data="/Users/johnb/path/to/my/image.jpg")

#Creates a photoset post using several local filepaths
client.create_photo(blogName, state="draft", tags=["jb is cool"], format="markdown",
                    data=["/Users/johnb/path/to/my/image.jpg", "/Users/johnb/Pictures/kittens.jpg"],
                    caption="## Mega sweet kittens")

Creating a text post

Creating a text post supports the same options as default and just a two other parameters * title - a string, the optional title for the post. Supports markdown or html * body - a string, the body of the of the post. Supports markdown or html

#Creating a text post
client.create_text(blogName, state="published", slug="testing-text-posts", title="Testing", body="testing1 2 3 4")

Creating a quote post

Creating a quote post supports the same options as default and two other parameter * quote - a string, the full text of the qote. Supports markdown or html * source - a string, the cited source. HTML supported

#Creating a quote post
client.create_quote(blogName, state="queue", quote="I am the Walrus", source="Ringo")

Creating a link post

  • title - a string, the title of post that you want. Supports HTML entities.
  • url - a string, the url that you want to create a link post for.
  • description - a string, the desciption of the link that you have
#Create a link post
client.create_link(blogName, title="I like to search things, you should too.", url="https://duckduckgo.com",
                   description="Search is pretty cool when a duck does it.")

Creating a chat post

Creating a chat post supports the same options as default and two other parameters * title - a string, the title of the chat post * conversation - a string, the text of the conversation/chat, with diablog labels (no html)

#Create a chat post
chat = """John: Testing can be fun!
Renee: Testing is tedious and so are you.
John: Aw.
"""
client.create_chat(blogName, title="Renee just doesn't understand.", conversation=chat, tags=["renee", "testing"])

Creating an audio post

Creating an audio post allows for all default options and a has 3 other parameters. The only thing to keep in mind while dealing with audio posts is to make sure that you use the external_url parameter or data. You cannot use both at the same time. * caption - a string, the caption for your post * external_url - a string, the url of the site that hosts the audio file * data - a string, the filepath of the audio file you want to upload to Tumblr

#Creating an audio file
client.create_audio(blogName, caption="Rock out.", data="/Users/johnb/Music/my/new/sweet/album.mp3")

#lets use soundcloud!
client.create_audio(blogName, caption="Mega rock out.", external_url="https://soundcloud.com/skrillex/sets/recess")

Creating a video post

Creating a video post allows for all default options and has three other options. Like the other post types, it has some restrictions. You cannot use the embed and data parameters at the same time. * caption - a string, the caption for your post * embed - a string, the HTML embed code for the video * data - a string, the path of the file you want to upload

#Creating an upload from YouTube
client.create_video(blogName, caption="Jon Snow. Mega ridiculous sword.",
                    embed="http://www.youtube.com/watch?v=40pUYLacrj4")

#Creating a video post from local file
client.create_video(blogName, caption="testing", data="/Users/johnb/testing/ok/blah.mov")

Editing a post

Updating a post requires you knowing what type a post you're updating. You'll be able to supply to the post any of the options given above for updates.

client.edit_post(blogName, id=post_id, type="text", title="Updated")
client.edit_post(blogName, id=post_id, type="photo", data="/Users/johnb/mega/awesome.jpg")

Reblogging a Post

Reblogging a post just requires knowing the post id and the reblog key, which is supplied in the JSON of any post object.

client.reblog(blogName, id=125356, reblog_key="reblog_key")

Deleting a post

Deleting just requires that you own the post and have the post id

client.delete_post(blogName, 123456) # Deletes your post :(

A note on tags: When passing tags, as params, please pass them as a list (not a comma-separated string):

client.create_text(blogName, tags=['hello', 'world'], ...)

Getting notes for a post

In order to get the notes for a post, you need to have the post id and the blog that it is on.

data = client.notes(blogName, id='123456')

The results include a timestamp you can use to make future calls.

data = client.notes(blogName, id='123456', before_timestamp=data["_links"]["next"]["query_params"]["before_timestamp"])

Tagged Methods

# get posts with a given tag
client.tagged(tag, **params)

Using the interactive console

This client comes with a nice interactive console to run you through the OAuth process, grab your tokens (and store them for future use).

You'll need pyyaml installed to run it, but then it's just:

$ python interactive-console.py

and away you go! Tokens are stored in ~/.tumblr and are also shared by other Tumblr API clients like the Ruby client.

Running tests

The tests (and coverage reports) are run with nose, like this:

python setup.py test

Author: tumblr
Source Code: https://github.com/tumblr/pytumblr
License: Apache-2.0 license

#python #api 

Tamale  Moses

Tamale Moses

1669003576

Exploring Mutable and Immutable in Python

In this Python article, let's learn about Mutable and Immutable in Python. 

Mutable and Immutable in Python

Mutable is a fancy way of saying that the internal state of the object is changed/mutated. So, the simplest definition is: An object whose internal state can be changed is mutable. On the other hand, immutable doesn’t allow any change in the object once it has been created.

Both of these states are integral to Python data structure. If you want to become more knowledgeable in the entire Python Data Structure, take this free course which covers multiple data structures in Python including tuple data structure which is immutable. You will also receive a certificate on completion which is sure to add value to your portfolio.

Mutable Definition

Mutable is when something is changeable or has the ability to change. In Python, ‘mutable’ is the ability of objects to change their values. These are often the objects that store a collection of data.

Immutable Definition

Immutable is the when no change is possible over time. In Python, if the value of an object cannot be changed over time, then it is known as immutable. Once created, the value of these objects is permanent.

List of Mutable and Immutable objects

Objects of built-in type that are mutable are:

  • Lists
  • Sets
  • Dictionaries
  • User-Defined Classes (It purely depends upon the user to define the characteristics) 

Objects of built-in type that are immutable are:

  • Numbers (Integer, Rational, Float, Decimal, Complex & Booleans)
  • Strings
  • Tuples
  • Frozen Sets
  • User-Defined Classes (It purely depends upon the user to define the characteristics)

Object mutability is one of the characteristics that makes Python a dynamically typed language. Though Mutable and Immutable in Python is a very basic concept, it can at times be a little confusing due to the intransitive nature of immutability.

Objects in Python

In Python, everything is treated as an object. Every object has these three attributes:

  • Identity – This refers to the address that the object refers to in the computer’s memory.
  • Type – This refers to the kind of object that is created. For example- integer, list, string etc. 
  • Value – This refers to the value stored by the object. For example – List=[1,2,3] would hold the numbers 1,2 and 3

While ID and Type cannot be changed once it’s created, values can be changed for Mutable objects.

Check out this free python certificate course to get started with Python.

Mutable Objects in Python

I believe, rather than diving deep into the theory aspects of mutable and immutable in Python, a simple code would be the best way to depict what it means in Python. Hence, let us discuss the below code step-by-step:

#Creating a list which contains name of Indian cities  

cities = [‘Delhi’, ‘Mumbai’, ‘Kolkata’]

# Printing the elements from the list cities, separated by a comma & space

for city in cities:
		print(city, end=’, ’)

Output [1]: Delhi, Mumbai, Kolkata

#Printing the location of the object created in the memory address in hexadecimal format

print(hex(id(cities)))

Output [2]: 0x1691d7de8c8

#Adding a new city to the list cities

cities.append(‘Chennai’)

#Printing the elements from the list cities, separated by a comma & space 

for city in cities:
	print(city, end=’, ’)

Output [3]: Delhi, Mumbai, Kolkata, Chennai

#Printing the location of the object created in the memory address in hexadecimal format

print(hex(id(cities)))

Output [4]: 0x1691d7de8c8

The above example shows us that we were able to change the internal state of the object ‘cities’ by adding one more city ‘Chennai’ to it, yet, the memory address of the object did not change. This confirms that we did not create a new object, rather, the same object was changed or mutated. Hence, we can say that the object which is a type of list with reference variable name ‘cities’ is a MUTABLE OBJECT.

Let us now discuss the term IMMUTABLE. Considering that we understood what mutable stands for, it is obvious that the definition of immutable will have ‘NOT’ included in it. Here is the simplest definition of immutable– An object whose internal state can NOT be changed is IMMUTABLE.

Again, if you try and concentrate on different error messages, you have encountered, thrown by the respective IDE; you use you would be able to identify the immutable objects in Python. For instance, consider the below code & associated error message with it, while trying to change the value of a Tuple at index 0. 

#Creating a Tuple with variable name ‘foo’

foo = (1, 2)

#Changing the index[0] value from 1 to 3

foo[0] = 3
	
TypeError: 'tuple' object does not support item assignment 

Immutable Objects in Python

Once again, a simple code would be the best way to depict what immutable stands for. Hence, let us discuss the below code step-by-step:

#Creating a Tuple which contains English name of weekdays

weekdays = ‘Sunday’, ‘Monday’, ‘Tuesday’, ‘Wednesday’, ‘Thursday’, ‘Friday’, ‘Saturday’

# Printing the elements of tuple weekdays

print(weekdays)

Output [1]:  (‘Sunday’, ‘Monday’, ‘Tuesday’, ‘Wednesday’, ‘Thursday’, ‘Friday’, ‘Saturday’)

#Printing the location of the object created in the memory address in hexadecimal format

print(hex(id(weekdays)))

Output [2]: 0x1691cc35090

#tuples are immutable, so you cannot add new elements, hence, using merge of tuples with the # + operator to add a new imaginary day in the tuple ‘weekdays’

weekdays  +=  ‘Pythonday’,

#Printing the elements of tuple weekdays

print(weekdays)

Output [3]: (‘Sunday’, ‘Monday’, ‘Tuesday’, ‘Wednesday’, ‘Thursday’, ‘Friday’, ‘Saturday’, ‘Pythonday’)

#Printing the location of the object created in the memory address in hexadecimal format

print(hex(id(weekdays)))

Output [4]: 0x1691cc8ad68

This above example shows that we were able to use the same variable name that is referencing an object which is a type of tuple with seven elements in it. However, the ID or the memory location of the old & new tuple is not the same. We were not able to change the internal state of the object ‘weekdays’. The Python program manager created a new object in the memory address and the variable name ‘weekdays’ started referencing the new object with eight elements in it.  Hence, we can say that the object which is a type of tuple with reference variable name ‘weekdays’ is an IMMUTABLE OBJECT.

Also Read: Understanding the Exploratory Data Analysis (EDA) in Python

Where can you use mutable and immutable objects:

Mutable objects can be used where you want to allow for any updates. For example, you have a list of employee names in your organizations, and that needs to be updated every time a new member is hired. You can create a mutable list, and it can be updated easily.

Immutability offers a lot of useful applications to different sensitive tasks we do in a network centred environment where we allow for parallel processing. By creating immutable objects, you seal the values and ensure that no threads can invoke overwrite/update to your data. This is also useful in situations where you would like to write a piece of code that cannot be modified. For example, a debug code that attempts to find the value of an immutable object.

Watch outs:  Non transitive nature of Immutability:

OK! Now we do understand what mutable & immutable objects in Python are. Let’s go ahead and discuss the combination of these two and explore the possibilities. Let’s discuss, as to how will it behave if you have an immutable object which contains the mutable object(s)? Or vice versa? Let us again use a code to understand this behaviour–

#creating a tuple (immutable object) which contains 2 lists(mutable) as it’s elements

#The elements (lists) contains the name, age & gender 

person = (['Ayaan', 5, 'Male'], ['Aaradhya', 8, 'Female'])

#printing the tuple

print(person)

Output [1]: (['Ayaan', 5, 'Male'], ['Aaradhya', 8, 'Female'])

#printing the location of the object created in the memory address in hexadecimal format

print(hex(id(person)))

Output [2]: 0x1691ef47f88

#Changing the age for the 1st element. Selecting 1st element of tuple by using indexing [0] then 2nd element of the list by using indexing [1] and assigning a new value for age as 4

person[0][1] = 4

#printing the updated tuple

print(person)

Output [3]: (['Ayaan', 4, 'Male'], ['Aaradhya', 8, 'Female'])

#printing the location of the object created in the memory address in hexadecimal format

print(hex(id(person)))

Output [4]: 0x1691ef47f88

In the above code, you can see that the object ‘person’ is immutable since it is a type of tuple. However, it has two lists as it’s elements, and we can change the state of lists (lists being mutable). So, here we did not change the object reference inside the Tuple, but the referenced object was mutated.

Also Read: Real-Time Object Detection Using TensorFlow

Same way, let’s explore how it will behave if you have a mutable object which contains an immutable object? Let us again use a code to understand the behaviour–

#creating a list (mutable object) which contains tuples(immutable) as it’s elements

list1 = [(1, 2, 3), (4, 5, 6)]

#printing the list

print(list1)

Output [1]: [(1, 2, 3), (4, 5, 6)]

#printing the location of the object created in the memory address in hexadecimal format

print(hex(id(list1)))

Output [2]: 0x1691d5b13c8	

#changing object reference at index 0

list1[0] = (7, 8, 9)

#printing the list

Output [3]: [(7, 8, 9), (4, 5, 6)]

#printing the location of the object created in the memory address in hexadecimal format

print(hex(id(list1)))

Output [4]: 0x1691d5b13c8

As an individual, it completely depends upon you and your requirements as to what kind of data structure you would like to create with a combination of mutable & immutable objects. I hope that this information will help you while deciding the type of object you would like to select going forward.

Before I end our discussion on IMMUTABILITY, allow me to use the word ‘CAVITE’ when we discuss the String and Integers. There is an exception, and you may see some surprising results while checking the truthiness for immutability. For instance:
#creating an object of integer type with value 10 and reference variable name ‘x’ 

x = 10
 

#printing the value of ‘x’

print(x)

Output [1]: 10

#Printing the location of the object created in the memory address in hexadecimal format

print(hex(id(x)))

Output [2]: 0x538fb560

#creating an object of integer type with value 10 and reference variable name ‘y’

y = 10

#printing the value of ‘y’

print(y)

Output [3]: 10

#Printing the location of the object created in the memory address in hexadecimal format

print(hex(id(y)))

Output [4]: 0x538fb560

As per our discussion and understanding, so far, the memory address for x & y should have been different, since, 10 is an instance of Integer class which is immutable. However, as shown in the above code, it has the same memory address. This is not something that we expected. It seems that what we have understood and discussed, has an exception as well.

Quick checkPython Data Structures

Immutability of Tuple

Tuples are immutable and hence cannot have any changes in them once they are created in Python. This is because they support the same sequence operations as strings. We all know that strings are immutable. The index operator will select an element from a tuple just like in a string. Hence, they are immutable.

Exceptions in immutability

Like all, there are exceptions in the immutability in python too. Not all immutable objects are really mutable. This will lead to a lot of doubts in your mind. Let us just take an example to understand this.

Consider a tuple ‘tup’.

Now, if we consider tuple tup = (‘GreatLearning’,[4,3,1,2]) ;

We see that the tuple has elements of different data types. The first element here is a string which as we all know is immutable in nature. The second element is a list which we all know is mutable. Now, we all know that the tuple itself is an immutable data type. It cannot change its contents. But, the list inside it can change its contents. So, the value of the Immutable objects cannot be changed but its constituent objects can. change its value.

FAQs

1. Difference between mutable vs immutable in Python?

Mutable ObjectImmutable Object
State of the object can be modified after it is created.State of the object can’t be modified once it is created.
They are not thread safe.They are thread safe
Mutable classes are not final.It is important to make the class final before creating an immutable object.

2. What are the mutable and immutable data types in Python?

  • Some mutable data types in Python are:

list, dictionary, set, user-defined classes.

  • Some immutable data types are: 

int, float, decimal, bool, string, tuple, range.

3. Are lists mutable in Python?

Lists in Python are mutable data types as the elements of the list can be modified, individual elements can be replaced, and the order of elements can be changed even after the list has been created.
(Examples related to lists have been discussed earlier in this blog.)

4. Why are tuples called immutable types?

Tuple and list data structures are very similar, but one big difference between the data types is that lists are mutable, whereas tuples are immutable. The reason for the tuple’s immutability is that once the elements are added to the tuple and the tuple has been created; it remains unchanged.

A programmer would always prefer building a code that can be reused instead of making the whole data object again. Still, even though tuples are immutable, like lists, they can contain any Python object, including mutable objects.

5. Are sets mutable in Python?

A set is an iterable unordered collection of data type which can be used to perform mathematical operations (like union, intersection, difference etc.). Every element in a set is unique and immutable, i.e. no duplicate values should be there, and the values can’t be changed. However, we can add or remove items from the set as the set itself is mutable.

6. Are strings mutable in Python?

Strings are not mutable in Python. Strings are a immutable data types which means that its value cannot be updated.

Join Great Learning Academy’s free online courses and upgrade your skills today.


Original article source at: https://www.mygreatlearning.com

#python 

Ray  Patel

Ray Patel

1619510796

Lambda, Map, Filter functions in python

Welcome to my Blog, In this article, we will learn python lambda function, Map function, and filter function.

Lambda function in python: Lambda is a one line anonymous function and lambda takes any number of arguments but can only have one expression and python lambda syntax is

Syntax: x = lambda arguments : expression

Now i will show you some python lambda function examples:

#python #anonymous function python #filter function in python #lambda #lambda python 3 #map python #python filter #python filter lambda #python lambda #python lambda examples #python map

Edward Jackson

Edward Jackson

1658802077

What is Bag of Words (BoW)? BoW Explained with Examples

In this Natural Language Processing (NLP) tutorial, you'll learn what Bag of Words (BoW) is, why BoW is used, learn about it’s implementation in Python and more.

  1. What is Bag of Words in NLP?
  2. Why is the Bag of Words algorithm used?
  3. Understanding Bag of Words with an example
  4. Implementing Bag of Words with Python
  5. Create a Bag of Words Model with Sklearn
  6. What are N-Grams?
  7. What is Tf-Idf ( term frequency-inverse document frequency)?
  8. Feature Extraction with Tf-Idf vectorizer
  9. Limitations of Bag of Word

Using Natural Language Processing, we make use of the text data available across the internet to generate insights for the business. In order to understand this huge amount of data and make insights from them, we need to make them usable. Natural language processing helps us to do so.

What is a Bag of Words in NLP?

Bag of words is a Natural Language Processing technique of text modelling. In technical terms, we can say that it is a method of feature extraction with text data. This approach is a simple and flexible way of extracting features from documents.

A bag of words is a representation of text that describes the occurrence of words within a document. We just keep track of word counts and disregard the grammatical details and the word order. It is called a “bag” of words because any information about the order or structure of words in the document is discarded. The model is only concerned with whether known words occur in the document, not where in the document.
 

Why is the Bag-of-Words algorithm used?

So, why bag-of-words, what is wrong with the simple and easy text?  

One of the biggest problems with text is that it is messy and unstructured, and machine learning algorithms prefer structured, well defined fixed-length inputs and by using the Bag-of-Words technique we can convert variable-length texts into a fixed-length vector.

Also, at a much granular level, the machine learning models work with numerical data rather than textual data. So to be more specific, by using the bag-of-words (BoW) technique, we convert a text into its equivalent vector of numbers.

Understanding Bag of Words with an example

Let us see an example of how the bag of words technique converts text into vectors

Example(1) without preprocessing: 

Sentence 1:  ”Welcome to Great Learning, Now start learning”

Sentence 2: “Learning is a good practice”

Sentence 1Sentence 2
WelcomeLearning
tois
Greata
Learninggood
,practice
Now 
start 
learning 

Step 1: Go through all the words in the above text and make a list of all of the words in our model vocabulary.

  • Welcome
  • To
  • Great
  • Learning
  • ,
  • Now
  • start
  • learning
  • is
  • a
  • good
  • practice

Note that the words ‘Learning’ and ‘ learning’ are not the same here because of the difference in their cases and hence are repeated. Also, note that a comma ‘ , ’ is also taken in the list.

Because we know the vocabulary has 12 words, we can use a fixed-length document-representation of 12, with one position in the vector to score each word.

The scoring method we use here is to count the presence of each word and mark 0 for absence. This scoring method is used more generally.

The scoring of sentence 1 would look as follows:

WordFrequency
Welcome1
to1
Great1
Learning1
,1
Now1
start1
learning1
is0
a0
good0
practice0

Writing the above frequencies in the vector 

Sentence 1 ➝ [ 1,1,1,1,1,1,1,1,0,0,0 ]

Now for sentence 2, the scoring would like 

WordFrequency
Welcome0
to0
Great0
Learning1
,0
Now0
start0
learning0
is1
a1
good1
practice1

Similarly, writing the above frequencies in the vector form

Sentence 2 ➝ [ 0,0,0,0,0,0,0,1,1,1,1,1 ]
 

SentenceWelcometoGreatLearning,Nowstart learningisagoodpractice
Sentence1111111110000
Sentence2000000011111

But is this the best way to perform a bag of words. The above example was not the best example of how to use a bag of words. The words Learning and learning, although having the same meaning are taken twice. Also, a comma ’,’ which does not convey any information is also included in the vocabulary.

Let us make some changes and see how we can use ‘bag of words in a more effective way.

Example(2) with preprocessing

Sentence 1: ”Welcome to Great Learning, Now start learning”

Sentence 2: “Learning is a good practice”
 

Step 1: Convert the above sentences in lower case as the case of the word does not hold any information.

Step 2: Remove special characters and stopwords from the text. Stopwords are the words that do not contain much information about text like ‘is’, ‘a’,’the and many more’.

After applying the above steps, the sentences are changed to

Sentence 1:  ”welcome great learning now start learning”

Sentence 2: “learning good practice”

Although the above sentences do not make much sense the maximum information is contained in these words only.

Step 3: Go through all the words in the above text and make a list of all of the words in our model vocabulary.

  • welcome
  • great
  • learning
  • now
  • start
  • good
  • practice

Now as the vocabulary has only 7 words, we can use a fixed-length document-representation of 7, with one position in the vector to score each word.

The scoring method we use here is the same as used in the previous example. For sentence 1, the count of words is as follow:

WordFrequency
welcome1
great1
learning2
now1
start1
good0
practice0

Writing the above frequencies in the vector 
 

Sentence 1 ➝ [ 1,1,2,1,1,0,0 ]
 

Now for sentence 2, the scoring would be like 

WordFrequency
welcome0
great0
learning1
now0
start0
good1
practice1

Similarly, writing the above frequencies in the vector form

Sentence 2 ➝ [ 0,0,1,0,0,1,1 ]
 

Sentencewelcomegreatlearningnowstart goodpractice
Sentence11121100
Sentence20010011

The approach used in example two is the one that is generally used in the Bag-of-Words technique, the reason being that the datasets used in Machine learning are tremendously large and can contain vocabulary of a few thousand or even millions of words. Hence, preprocessing the text before using bag-of-words is a better way to go.

In the examples above we use all the words from vocabulary to form a vector, which is neither a practical way nor the best way to implement the BoW model. In practice, only a few words from the vocabulary, more preferably most common words are used to form the vector. 

Implementing Bag of Words Algorithm with Python

In this section, we are going to implement a bag of words algorithm with Python. Also, this is a very basic implementation to understand how bag of words algorithm work, so I would not recommend using this in your project, instead use the method described in the next section.

def vectorize(tokens):
    ''' This function takes list of words in a sentence as input 
    and returns a vector of size of filtered_vocab.It puts 0 if the 
    word is not present in tokens and count of token if present.'''
    vector=[]
    for w in filtered_vocab:
        vector.append(tokens.count(w))
    return vector
def unique(sequence):
    '''This functions returns a list in which the order remains 
    same and no item repeats.Using the set() function does not 
    preserve the original ordering,so i didnt use that instead'''
    seen = set()
    return [x for x in sequence if not (x in seen or seen.add(x))]
#create a list of stopwords.You can import stopwords from nltk too
stopwords=["to","is","a"]
#list of special characters.You can use regular expressions too
special_char=[",",":"," ",";",".","?"]
#Write the sentences in the corpus,in our case, just two 
string1="Welcome to Great Learning , Now start learning"
string2="Learning is a good practice"
#convert them to lower case
string1=string1.lower()
string2=string2.lower()
#split the sentences into tokens
tokens1=string1.split()
tokens2=string2.split()
print(tokens1)
print(tokens2)
#create a vocabulary list
vocab=unique(tokens1+tokens2)
print(vocab)
#filter the vocabulary list
filtered_vocab=[]
for w in vocab: 
    if w not in stopwords and w not in special_char: 
        filtered_vocab.append(w)
print(filtered_vocab)
#convert sentences into vectords
vector1=vectorize(tokens1)
print(vector1)
vector2=vectorize(tokens2)
print(vector2)

Output:

Create a Bag of Words Model with Sklearn

We can use the CountVectorizer() function from the Sk-learn library to easily implement the above BoW model using Python.

import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
 
sentence_1="This is a good job.I will not miss it for anything"
sentence_2="This is not good at all"
 
 
 
CountVec = CountVectorizer(ngram_range=(1,1), # to use bigrams ngram_range=(2,2)
                           stop_words='english')
#transform
Count_data = CountVec.fit_transform([sentence_1,sentence_2])
 
#create dataframe
cv_dataframe=pd.DataFrame(Count_data.toarray(),columns=CountVec.get_feature_names())
print(cv_dataframe)

What are N-Grams?

Again same questions, what are n-grams and why do we use them? Let us understand this with an example below-

Sentence 1: “This is a good job. I will not miss it for anything”

Sentence 2: ”This is not good at all”

For this example, let us take the vocabulary of 5 words only. The five words being-

  • good
  • job
  • miss
  • not
  • all

So, the respective vectors for these sentences are:

“This is a good job. I will not miss it for anything”=[1,1,1,1,0]

”This is not good at all”=[1,0,0,1,1]

Can you guess what is the problem here? Sentence 2 is a negative sentence and sentence 1 is a positive sentence. Does this reflect in any way in the vectors above? Not at all. So how can we solve this problem? Here come the N-grams to our rescue.

An N-gram is an N-token sequence of words: a 2-gram (more commonly called a bigram) is a two-word sequence of words like “really good”, “not good”, or “your homework”, and a 3-gram (more commonly called a trigram) is a three-word sequence of words like “not at all”, or “turn off light”.

For example, the bigrams in the first line of text in the previous section: “This is not good at all” are as follows:

  • “This is”
  • “is not”
  • “not good”
  • “good at”
  • “at all”

Now if instead of using just words in the above example, we use bigrams (Bag-of-bigrams) as shown above. The model can differentiate between sentence 1 and sentence 2. So, using bi-grams makes tokens more understandable (for example, “HSR Layout”, in Bengaluru, is more informative than “HSR” and “layout”)

So we can conclude that a bag-of-bigrams representation is much more powerful than bag-of-words, and in many cases proves very hard to beat.

What is Tf-Idf ( term frequency-inverse document frequency)?

The scoring method being used above takes the count of each word and represents the word in the vector by the number of counts of that particular word. What does a word having high word count signify?

Does this mean that the word is important in retrieving information about documents? The answer is NO. Let me explain, if a word occurs many times in a document but also along with many other documents in our dataset, maybe it is because this word is just a frequent word; not because it is relevant or meaningful.

One approach is to rescale the frequency of words by how often they appear in all documents so that the scores for frequent words like “the” that are also frequent across all documents are penalized. This approach is called term frequency-inverse document frequency or shortly known as Tf-Idf approach of scoring.TF-IDF is intended to reflect how relevant a term is in a given document. So how is Tf-Idf of a document in a dataset calculated?

TF-IDF for a word in a document is calculated by multiplying two different metrics:

The term frequency (TF) of a word in a document. There are several ways of calculating this frequency, with the simplest being a raw count of instances a word appears in a document. Then, there are other ways to adjust the frequency. For example, by dividing the raw count of instances of a word by either length of the document, or by the raw frequency of the most frequent word in the document. The formula to calculate Term-Frequency is

TF(i,j)=n(i,j)/Σ n(i,j)

Where,

n(i,j )= number of times nth word  occurred in a document
Σn(i,j) = total number of words in a document. 

The inverse document frequency(IDF) of the word across a set of documents. This suggests how common or rare a word is in the entire document set. The closer it is to 0, the more common is the word. This metric can be calculated by taking the total number of documents, dividing it by the number of documents that contain a word, and calculating the logarithm.

So, if the word is very common and appears in many documents, this number will approach 0. Otherwise, it will approach 1.

Multiplying these two numbers results in the TF-IDF score of a word in a document. The higher the score, the more relevant that word is in that particular document.

To put it in mathematical terms, the TF-IDF score is calculated as follows:

IDF=1+log(N/dN)

Where

N=Total number of documents in the dataset
dN=total number of documents in which nth word occur 

Also, note that the 1 added in the above formula is so that terms with zero IDF don’t get suppressed entirely. This process is known as IDF smoothing.

The TF-IDF is obtained by 

TF-IDF=TF*IDF

Does this seem too complicated? Don’t worry, this can be attained with just a few lines of code and you don’t even have to remember these scary formulas.

Feature Extraction with Tf-Idf vectorizer

We can use the TfidfVectorizer() function from the Sk-learn library to easily implement the above BoW(Tf-IDF), model.

import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
 
sentence_1="This is a good job.I will not miss it for anything"
sentence_2="This is not good at all"
 
 
 
#without smooth IDF
print("Without Smoothing:")
#define tf-idf
tf_idf_vec = TfidfVectorizer(use_idf=True, 
                        smooth_idf=False,  
                        ngram_range=(1,1),stop_words='english') # to use only  bigrams ngram_range=(2,2)
#transform
tf_idf_data = tf_idf_vec.fit_transform([sentence_1,sentence_2])
 
#create dataframe
tf_idf_dataframe=pd.DataFrame(tf_idf_data.toarray(),columns=tf_idf_vec.get_feature_names())
print(tf_idf_dataframe)
print("\n")
 
#with smooth
tf_idf_vec_smooth = TfidfVectorizer(use_idf=True,  
                        smooth_idf=True,  
                        ngram_range=(1,1),stop_words='english')
 
 
tf_idf_data_smooth = tf_idf_vec_smooth.fit_transform([sentence_1,sentence_2])
 
print("With Smoothing:")
tf_idf_dataframe_smooth=pd.DataFrame(tf_idf_data_smooth.toarray(),columns=tf_idf_vec_smooth.get_feature_names())
print(tf_idf_dataframe_smooth)

Limitations of Bag-of-Words

Although Bag-of-Words is quite efficient and easy to implement, still there are some disadvantages to this technique which are given below:

  1. The model ignores the location information of the word. The location information is a piece of very important information in the text. For example  “today is off” and “Is today off”, have the exact same vector representation in the BoW model.
  2. Bag of word models doesn’t respect the semantics of the word. For example, words ‘soccer’ and ‘football’ are often used in the same context. However, the vectors corresponding to these words are quite different in the bag of words model. The problem becomes more serious while modeling sentences. Ex: “Buy used cars” and “Purchase old automobiles” are represented by totally different vectors in the Bag-of-words model.
  3. The range of vocabulary is a big issue faced by the Bag-of-Words model. For example, if the model comes across a new word it has not seen yet, rather we say a rare, but informative word like Biblioklept(means one who steals books). The BoW model will probably end up ignoring this word as this word has not been seen by the model yet.

Original article source at https://www.mygreatlearning.com

#bagofwords #python #datascience #nlp 

Shardul Bhatt

Shardul Bhatt

1626775355

Why use Python for Software Development

No programming language is pretty much as diverse as Python. It enables building cutting edge applications effortlessly. Developers are as yet investigating the full capability of end-to-end Python development services in various areas. 

By areas, we mean FinTech, HealthTech, InsureTech, Cybersecurity, and that's just the beginning. These are New Economy areas, and Python has the ability to serve every one of them. The vast majority of them require massive computational abilities. Python's code is dynamic and powerful - equipped for taking care of the heavy traffic and substantial algorithmic capacities. 

Programming advancement is multidimensional today. Endeavor programming requires an intelligent application with AI and ML capacities. Shopper based applications require information examination to convey a superior client experience. Netflix, Trello, and Amazon are genuine instances of such applications. Python assists with building them effortlessly. 

5 Reasons to Utilize Python for Programming Web Apps 

Python can do such numerous things that developers can't discover enough reasons to admire it. Python application development isn't restricted to web and enterprise applications. It is exceptionally adaptable and superb for a wide range of uses.

Robust frameworks 

Python is known for its tools and frameworks. There's a structure for everything. Django is helpful for building web applications, venture applications, logical applications, and mathematical processing. Flask is another web improvement framework with no conditions. 

Web2Py, CherryPy, and Falcon offer incredible capabilities to customize Python development services. A large portion of them are open-source frameworks that allow quick turn of events. 

Simple to read and compose 

Python has an improved sentence structure - one that is like the English language. New engineers for Python can undoubtedly understand where they stand in the development process. The simplicity of composing allows quick application building. 

The motivation behind building Python, as said by its maker Guido Van Rossum, was to empower even beginner engineers to comprehend the programming language. The simple coding likewise permits developers to roll out speedy improvements without getting confused by pointless subtleties. 

Utilized by the best 

Alright - Python isn't simply one more programming language. It should have something, which is the reason the business giants use it. Furthermore, that too for different purposes. Developers at Google use Python to assemble framework organization systems, parallel information pusher, code audit, testing and QA, and substantially more. Netflix utilizes Python web development services for its recommendation algorithm and media player. 

Massive community support 

Python has a steadily developing community that offers enormous help. From amateurs to specialists, there's everybody. There are a lot of instructional exercises, documentation, and guides accessible for Python web development solutions. 

Today, numerous universities start with Python, adding to the quantity of individuals in the community. Frequently, Python designers team up on various tasks and help each other with algorithmic, utilitarian, and application critical thinking. 

Progressive applications 

Python is the greatest supporter of data science, Machine Learning, and Artificial Intelligence at any enterprise software development company. Its utilization cases in cutting edge applications are the most compelling motivation for its prosperity. Python is the second most well known tool after R for data analytics.

The simplicity of getting sorted out, overseeing, and visualizing information through unique libraries makes it ideal for data based applications. TensorFlow for neural networks and OpenCV for computer vision are two of Python's most well known use cases for Machine learning applications.

Summary

Thinking about the advances in programming and innovation, Python is a YES for an assorted scope of utilizations. Game development, web application development services, GUI advancement, ML and AI improvement, Enterprise and customer applications - every one of them uses Python to its full potential. 

The disadvantages of Python web improvement arrangements are regularly disregarded by developers and organizations because of the advantages it gives. They focus on quality over speed and performance over blunders. That is the reason it's a good idea to utilize Python for building the applications of the future.

#python development services #python development company #python app development #python development #python in web development #python software development