Introduction

Regular expressions, or regex, are sequences of characters used to check whether a pattern exists in a given text (string), for example, to find out whether “123” exists in “Practical123DataScie”. The regex parser interprets “123” as a sequence of ordinary characters, each of which matches only itself in the string. But the real power of regular expressions emerges when a pattern contains special characters called metacharacters. These have a special meaning to the regex matching engine and vastly enhance the capability of the search.

Regex functionality resides in a module named re. As with any Python module, we only need to import it as follows to start working with it.

import re

This tutorial covers some of the most useful functions in the re module, such as search() for finding a match and split() for splitting a string on a pattern. You will also learn to create complex matching patterns with metacharacters.

I) .search() function

A regular expression search is typically written as:

re.search(pattern, string)

This function scans the string for the first location where the pattern matches and returns a match object for it. If there is no match, it returns None. Let us look at the following example:

s1 = "Practical123DataScie"
re.search("123", s1)

Output: <re.Match object; span=(9, 12), match='123'>
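
As a small sketch of how this result might be used in code, the match object can be queried with its span() and group() methods, and a failed search can be detected by checking for None. The variable names m, found, and missing, and the search for "999", are illustrative choices and not part of the original example.

```python
import re

s1 = "Practical123DataScie"

# re.search returns a match object on success and None on failure,
# so the result can be tested before it is used.
m = re.search("123", s1)
if m is not None:
    start, end = m.span()   # start and end indices of the match
    found = m.group()       # the text that was matched
    print(start, end, found, s1[start:end])

# "999" does not occur in s1, so this search returns None.
missing = re.search("999", s1)
print(missing)
```

Checking for None before calling methods on the result avoids an AttributeError when the pattern is not found.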

The output of the search for “123” carries a lot of information. It tells you that there is a match and that it is located at s1[9:12]. This is an easy case, but we often need to search for more complex patterns. Imagine now that you want to look for three consecutive digits, like “456” or “789”. Here we need a pattern rather than a literal string, because we do not know in advance exactly what those digits are. They could be “124”, “052”, and so on. How can we do that?

s2 = "PracticalDataScie052"
re.search('[0-9][0-9][0-9]', s2)

Output: <re.Match object; span=(17, 20), match='052'>
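
To illustrate that the same pattern matches any run of three consecutive digits, and not only “052”, here is a quick sketch. The sample strings are hypothetical and were chosen only for illustration.

```python
import re

pattern = "[0-9][0-9][0-9]"

# Hypothetical sample strings, not taken from the original example.
samples = ["abc124", "x052y", "no digits", "a1b2c3"]

results = {}
for text in samples:
    m = re.search(pattern, text)
    # Store the matched text, or None when there is no match.
    results[text] = m.group() if m else None
    print(text, "->", results[text])
```

The last sample contains three digits, but they are not consecutive, so the pattern does not match it.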

There are a lot of concepts to talk about here. The pattern used above is '[0-9][0-9][0-9]'. First, let us talk about square brackets ([]). A pattern of the form [...] matches any single character listed inside the square brackets, and a range such as 0-9 stands for every character from 0 through 9. For example:

re.search('[0]', s2)

Output: <re.Match object; span=(17, 18), match='0'>
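
A few more variations on square brackets, as a minimal sketch using the same s2 string. The patterns '[52]' and '[a-c]' and the variable names are illustrative additions, not part of the original example.

```python
import re

s2 = "PracticalDataScie052"

# A single character in brackets matches only that character.
only_zero = re.search("[0]", s2)

# Several characters in brackets: match the first character that is '5' or '2'.
five_or_two = re.search("[52]", s2)

# A range in brackets: match the first character between 'a' and 'c'.
a_to_c = re.search("[a-c]", s2)

for m in (only_zero, five_or_two, a_to_c):
    print(m.span(), m.group())
```

In each case the brackets match exactly one character, which is why the '[0-9]' group has to be repeated three times to match three digits.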
