Python Regex Tutorial with Example

What is a Regular Expression?

A regular expression is a special text string that describes a search pattern. It is extremely useful for extracting information from text such as code, files, logs, spreadsheets, or even documents.

When using regular expressions, the first thing to recognize is that everything is essentially a character, and we write patterns to match a specific sequence of characters, also referred to as a string. ASCII or Latin letters are those found on your keyboard, and Unicode is used to match text in other scripts. Characters include digits, punctuation, and special characters such as $, #, @, !, and %.

In this tutorial, we will learn:

  • Regular Expression Syntax
  • Example of \w+ and ^ Expression
  • Example of \s expression in re.split function
  • Using regular expression methods
  • Using re.match()
  • Finding Pattern in Text (re.search())
  • Using re.findall for text
  • Python Flags
  • Example of re.M or Multiline Flags

For instance, a regular expression could tell a program to search for specific text in a string and then print the result accordingly. An expression can include

  • Text matching
  • Repetition
  • Branching
  • Pattern-composition etc.
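
Each of the building blocks listed above can be shown with a one-line pattern. The following is a minimal sketch; the sample strings and variable names are made up for illustration and are not part of the original tutorial.

```python
import re

# Text matching: a literal pattern matches itself.
literal = re.findall(r"fun", "regex is fun, fun, fun")

# Repetition: + means "one or more of the preceding item".
repeated = re.findall(r"o+", "foo boooo bo")

# Branching: | means "either the left or the right alternative".
branched = re.findall(r"cat|dog", "cat, dog, bird")

# Composition: combine branching and repetition in one pattern.
composed = re.findall(r"(?:cat|dog)s+", "cats dogss bird cat")

print(literal)   # ['fun', 'fun', 'fun']
print(repeated)  # ['oo', 'oooo', 'o']
print(branched)  # ['cat', 'dog']
print(composed)  # ['cats', 'dogss']
```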

In Python, regular expressions (also called REs, regexes, or regex patterns) are available through the re module, which is part of the standard library. Python regular expressions support features such as modifiers, identifiers, and whitespace characters.

Regular Expression Syntax

RE

import re
  • The “re” module is included with Python and is primarily used for string searching and manipulation
  • It is also frequently used for web page “scraping” (extracting large amounts of data from websites)

We will begin with a simple exercise using the expressions \w+ and ^.

Example of \w+ and ^ Expression

  • “^”: This expression matches the start of a string
  • “\w+”: This expression matches one or more word characters (letters, digits, or underscore) in the string

Here we will see an example of how to use the \w+ and ^ expressions in our code. We cover the re.findall function later in this tutorial; for now we simply focus on the \w+ and ^ expressions.

For example, for our string “guru99,education is fun”, if we execute the code with \w+ and ^, it will give the output ['guru99'].

Python Regex Tutorial: re.match(),re.search(), re.findall(), Flags

import re
xx = "guru99,education is fun"
r1 = re.findall(r"^\w+",xx)
print(r1)

Remember, if you remove the + sign from \w+, the output will change: the pattern then matches only the first character of the first word, i.e., ['g'].
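
The effect of the + sign and of the ^ anchor can be checked side by side. This is a small sketch that reuses the tutorial's string; the variable names one_char, whole_word and every_word are illustrative only.

```python
import re

xx = "guru99,education is fun"

one_char = re.findall(r"^\w", xx)     # a single word character at the start
whole_word = re.findall(r"^\w+", xx)  # one or more word characters at the start
every_word = re.findall(r"\w+", xx)   # without ^, every run of word characters

print(one_char)    # ['g']
print(whole_word)  # ['guru99']
print(every_word)  # ['guru99', 'education', 'is', 'fun']
```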

Example of \s expression in re.split function

  • “\s”: This expression matches a whitespace character (space, tab, or newline) in the string

To understand how this regular expression works in Python, we begin with a simple example of a split function. In the example, we split the string into words using the re.split function with the expression \s, which matches the whitespace between the words.


When you execute this code, it will give you the output ['we', 'are', 'splitting', 'the', 'words'].

Now, let's see what happens if you remove the backslash “\” from \s. The pattern “s” is then evaluated as an ordinary character, so the string is split wherever the letter “s” occurs, and no “s” appears in the output. For the string “split the words”, the result is ['', 'plit the word', ''].


Similarly, there is a series of other regular expression tokens that you can use in Python, such as \d, \D, $, ., and \b.
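
As a rough illustration of the other tokens mentioned here, the sketch below applies \d, \D, $, . and \b to made-up sample strings (the text and variable names are assumptions for the example only).

```python
import re

text = "Order 66 shipped on day 7."

digits = re.findall(r"\d+", text)              # \d: runs of digits
non_digits = re.findall(r"\D+", "a1b22c")      # \D: runs of non-digit characters
ends_with_dot = bool(re.search(r"\.$", text))  # $: anchors the match at the end
any_middle = re.findall(r"d.y", text)          # .: any single character except newline
whole_word = re.findall(r"\bon\b", text)       # \b: word boundary on both sides

print(digits)         # ['66', '7']
print(non_digits)     # ['a', 'b', 'c']
print(ends_with_dot)  # True
print(any_middle)     # ['day']
print(whole_word)     # ['on']
```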

Here is the complete code

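import re

# ^\w+ matches the run of word characters at the start of the string
xx = "guru99,education is fun"
r1 = re.findall(r"^\w+", xx)
print(r1)
# \s splits on whitespace; a bare "s" splits on the letter s
print(re.split(r'\s', 'we are splitting the words'))
print(re.split(r's', 'split the words'))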

Next, we will look at the methods that are used with regular expressions.

Using regular expression methods

The “re” package provides several methods to perform queries on an input string. The methods we are going to see are

  • re.match()
  • re.search()
  • re.findall()

Note: Based on the regular expressions, Python offers two different primitive operations. The match method checks for a match only at the beginning of the string while search checks for a match anywhere in the string.
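
The difference between the two operations can be seen by running the same pattern through both. This is a minimal sketch; the sample text is invented for the illustration.

```python
import re

text = "learn regex with guru99"

at_start = re.match(r"guru99", text)   # None: the text does not begin with guru99
anywhere = re.search(r"guru99", text)  # match object: guru99 occurs later in the text
start_ok = re.match(r"learn", text)    # match object: the text begins with learn

print(at_start)          # None
print(anywhere.group())  # guru99
print(anywhere.start())  # 17
print(start_ok.group())  # learn
```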

Using re.match()

The match function tries to match the RE pattern against the start of a string, with optional flags. In this example, the pattern “(g\w+)\W(g\w+)” matches two words that each start with the letter ‘g’, separated by a single non-word character; any element that does not have this form is not matched. To check each element in the list, we run a for loop.


Finding Pattern in Text (re.search())

A regular expression is commonly used to search for a pattern in a text. This method takes a regular expression pattern and a string, and searches for that pattern within the string.

In order to use the search() function, you need to import re first and then execute the code. The search() function takes the “pattern” and the “text” to scan, and returns a match object when the pattern is found, or None if the pattern is not found.
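
The return value of search() can be inspected directly. The sketch below assumes the same sample text used elsewhere in this tutorial; found and missing are illustrative variable names.

```python
import re

text = "software testing is fun?"

found = re.search(r"testing", text)   # pattern occurs: a match object is returned
missing = re.search(r"guru99", text)  # pattern does not occur: None is returned

if found:
    # group() is the matched text, span() is the (start, end) index pair
    print(found.group(), found.span())

print(missing is None)
```

Because None is falsy, the result of search() can be used directly in an if statement, which is how the complete code further down uses it.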


For example, here we look for two literal strings, “software testing” and “guru99”, in the text string “software testing is fun?”. For “software testing” a match is found, so the output is “found a match!”, while “guru99” does not occur in the string, so the output is “no match”.

re.findall()

The findall() function is used to search for “all” occurrences that match a given pattern. In contrast, the search() function only returns the first occurrence that matches the specified pattern. findall() scans the whole string and returns all non-overlapping matches of the pattern as a list in a single step.

For example, here we have a string containing several e-mail addresses, and we want to fetch all of them. We use the re.findall method, which finds every e-mail address in the string.
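
To make the contrast with search() concrete, the following sketch runs both functions over the tutorial's e-mail string. The last line uses a capturing group, in which case findall() returns only the captured part; the domains variable is an addition for illustration.

```python
import re

abc = "guru99@google.com, careerguru99@hotmail.com, users@yahoomail.com"
pattern = r"[\w.-]+@[\w.-]+"

first = re.search(pattern, abc).group()  # only the first occurrence
all_emails = re.findall(pattern, abc)    # every non-overlapping occurrence

# With one capturing group, findall returns the group's text, not the whole match
domains = re.findall(r"@([\w.-]+)", abc)

print(first)       # guru99@google.com
print(all_emails)  # ['guru99@google.com', 'careerguru99@hotmail.com', 'users@yahoomail.com']
print(domains)     # ['google.com', 'hotmail.com', 'yahoomail.com']
```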


Here is the complete code

import re

# re.match: keep only elements made of two words that both start with "g"
words = ["guru99 get", "guru99 give", "guru Selenium"]
for element in words:
    z = re.match(r"(g\w+)\W(g\w+)", element)
    if z:
        print(z.groups())

# re.search: report whether each pattern occurs anywhere in the text
patterns = ['software testing', 'guru99']
text = 'software testing is fun?'
for pattern in patterns:
    print('Looking for "%s" in "%s" ->' % (pattern, text), end=' ')
    if re.search(pattern, text):
        print('found a match!')
    else:
        print('no match')

# re.findall: collect every e-mail address in the string
abc = 'guru99@google.com, careerguru99@hotmail.com, users@yahoomail.com'
emails = re.findall(r'[\w\.-]+@[\w\.-]+', abc)
for email in emails:
    print(email)

Python Flags

Many Python regex methods and functions take an optional argument called flags. These flags can modify the meaning of the given regex pattern. To understand them, we will look at an example of one of these flags.

Commonly used flags in Python include re.I (ignore case), re.M (multiline), and re.S (dot matches newline).
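
As a rough sketch of how flags change matching, the example below applies re.IGNORECASE (re.I), re.DOTALL (re.S) and a combination of flags to a made-up two-line string. Flags are combined with the | operator.

```python
import re

text = "Guru99\nguru99"

case_sensitive = re.findall(r"guru99", text)              # exact case only
ignore_case = re.findall(r"guru99", text, re.IGNORECASE)  # re.I: case is ignored

dot_default = re.search(r"Guru99.guru99", text)           # . does not match the newline
dot_all = re.search(r"Guru99.guru99", text, re.DOTALL)    # re.S: . also matches newline

combined = re.findall(r"^guru99$", text, re.I | re.M)     # two flags at once

print(case_sensitive)       # ['guru99']
print(ignore_case)          # ['Guru99', 'guru99']
print(dot_default)          # None
print(dot_all is not None)  # True
print(combined)             # ['Guru99', 'guru99']
```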

Example of re.M or Multiline Flags

In multiline mode, the pattern character ^ matches at the beginning of the string and at the beginning of each line (immediately following each newline). The expression \w matches a single word character. When you run the code, the first variable “k1” contains only the character ‘g’ from the word guru99, while with the multiline flag added, “k2” contains the first character of every line in the string.


Here is the code

import re
xx = """guru99 
careerguru99	
selenium"""
k1 = re.findall(r"^\w", xx)
k2 = re.findall(r"^\w", xx, re.MULTILINE)
print(k1)
print(k2)
  • We declared the variable xx for the three-line string "guru99 … careerguru99 … selenium"
  • When the code is run without the multiline flag, k1 contains only ‘g’, the first character of the string
  • When the code is run with the multiline flag, k2 contains ‘g’, ‘c’ and ‘s’, the first character of each line
  • This shows the difference that the multiline flag makes in the example above

Likewise, you can also use other Python flags such as re.U (Unicode), re.L (follow locale), and re.X (allow comments).
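
For instance, re.X (also spelled re.VERBOSE) lets a pattern be spread over several lines with comments. The sketch below rewrites the tutorial's e-mail pattern in that style; email_pattern is an illustrative name, and the pattern is deliberately loose rather than a full e-mail validator.

```python
import re

# In verbose mode, whitespace in the pattern is ignored and # starts a comment
email_pattern = re.compile(r"""
    [\w.-]+   # local part: word characters, dots, hyphens
    @         # literal at sign
    [\w.-]+   # domain: word characters, dots, hyphens
""", re.VERBOSE)

emails = email_pattern.findall("guru99@google.com, users@yahoomail.com")
print(emails)  # ['guru99@google.com', 'users@yahoomail.com']
```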

Python 2 Example

The code above is written for Python 3. If you want to run it in Python 2, please use the following code.

# Example of w+ and ^ Expression
import re
xx = "guru99,education is fun"
r1 = re.findall(r"^\w+",xx)
print r1

# Example of \s expression in re.split function
import re
xx = "guru99,education is fun"
r1 = re.findall(r"^\w+", xx)
print (re.split(r'\s','we are splitting the words'))
print (re.split(r's','split the words'))

# Using re.match, re.search and re.findall
import re

words = ["guru99 get", "guru99 give", "guru Selenium"]
for element in words:
    z = re.match(r"(g\w+)\W(g\w+)", element)
    if z:
        print(z.groups())

patterns = ['software testing', 'guru99']
text = 'software testing is fun?'
for pattern in patterns:
    print 'Looking for "%s" in "%s" ->' % (pattern, text),
    if re.search(pattern, text):
        print 'found a match!'
    else:
        print 'no match'
abc = 'guru99@google.com, careerguru99@hotmail.com, users@yahoomail.com'
emails = re.findall(r'[\w\.-]+@[\w\.-]+', abc)
for email in emails:
    print email

# Example of re.M or Multiline Flags
import re
xx = """guru99 
careerguru99	
selenium"""
k1 = re.findall(r"^\w", xx)
k2 = re.findall(r"^\w", xx, re.MULTILINE)
print k1
print k2

Summary

A regular expression in a programming language is a special text string used for describing a search pattern. It can include digits, punctuation, and special characters such as $, #, @, !, and %. An expression can include

  • Text matching
  • Repetition
  • Branching
  • Pattern-composition etc.

In Python, regular expressions (REs, regexes, or regex patterns) are available through the re module.

  • The “re” module is included with Python and is primarily used for string searching and manipulation
  • It is also frequently used for web page “scraping” (extracting large amounts of data from websites)
  • Regular expression methods include re.match(), re.search() and re.findall()
  • Many Python regex methods and functions take an optional argument called flags
  • These flags can modify the meaning of the given regex pattern
  • Python flags used with regex methods include re.M, re.I, re.S, etc.

Python Tutorial: re Module - How to Write and Match Regular Expressions (Regex)

In this Python Programming Tutorial, we will be learning how to read, write, and match regular expressions with the re module. Regular expressions are extremely useful for matching common patterns of text such as email addresses, phone numbers, URLs, etc. Learning how to do this within Python will allow us to quickly parse files and text for the information we need.


Python RegEx | Python Regular Expressions Tutorial | Python Tutorial | Python Training

This Edureka “Python RegEx” tutorial (Python Tutorial Blog: https://goo.gl/wd28Zr) will help you in understanding how to use regular expressions in Python. You will get to learn different regular expression operations and syntaxes. You will be learning how to implement all the regex operations in python practically.

Below are the topics covered in this tutorial:

  1. Why we use Regular Expressions?
  2. What are Regular Expressions?
  3. Basic Regular Expressions operations
  4. E-mail verification using Regular Expressions
  5. Phone number verification using Regular Expressions
  6. Web scraping using Regular Expressions

Regular Expressions (Regex) Tutorial: How to Match Any Pattern of Text

In this regular expressions (regex) tutorial, we’re going to be learning how to match patterns of text. Regular expressions are extremely useful for matching common patterns of text such as email addresses, phone numbers, URLs, etc. Almost every programming language has a regular expression library, so learning regular expressions will not only help you find patterns in your text editors, but will also let you use these programming libraries to search for patterns programmatically.


Python Regular Expressions (RegEx) | Regular Expressions In Python | Python Tutorial

This Python regular expressions tutorial will help you understand what regular expressions are and the symbols used for writing them, along with a demo on how to use these. Regular expressions are a set of characters that help identify strings of a specific pattern. Regular expressions have a history in a number of other languages. Now, let us get started and understand how to use regular expressions along with a demo.

Below topics are explained in this Python regular expressions tutorial:

  1. What are regular expressions? (00:15)
  2. Symbols for writing regular expressions (01:41)
  • $: Specifies that the match must occur at the end of the string
  • []: Matches any one of the characters within the brackets
  • [^…]: Matches any one character that is not in the brackets
  • .: Represents a single occurrence of any character except newline
  • ?: The preceding character is optional
  • ^: Specifies that the match must start at the beginning of the string
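
The symbols in the list above can each be tried with a short pattern. This is a minimal sketch; the sample strings and variable names are assumptions made for the illustration.

```python
import re

ends = bool(re.search(r"fun$", "regex is fun"))  # $: match must be at the end
vowels = re.findall(r"[aeiou]", "guru")          # []: any one listed character
non_vowels = re.findall(r"[^aeiou]", "guru")     # [^...]: any one character not listed
dot = re.findall(r"g.r", "guru gar g\nr")        # .: any character except newline
optional = re.findall(r"colou?r", "color colour")  # ?: preceding character is optional
starts = bool(re.search(r"^guru", "guru99"))     # ^: match must be at the beginning

print(ends)        # True
print(vowels)      # ['u', 'u']
print(non_vowels)  # ['g', 'r']
print(dot)         # ['gur', 'gar']
print(optional)    # ['color', 'colour']
print(starts)      # True
```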
