1. Matching a word irrespective of its case

Sometimes in a text, the same word can be written in different ways. This is most commonly the case with proper nouns. Instead of starting with an uppercase letter, sometimes they are written in all lowercase letters.

$ grep "[Jj]ayant"

Both versions of the word, ‘Jayant’ and ‘jayant’, are matched, irrespective of the case of the first letter.
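To try this without a file, the pattern can be run against a few lines piped in from printf. This is only a minimal sketch; the sample lines and the variable names are made up for illustration.

```shell
# Sample input (hypothetical lines, not taken from the article).
sample='Jayant wrote the report.
the author is jayant.
JAYANT is shouting.'

# [Jj] matches either an uppercase or a lowercase first letter;
# the remaining letters must be lowercase.
matches=$(printf '%s\n' "$sample" | grep "[Jj]ayant")
printf '%s\n' "$matches"
```

The all-uppercase line is not printed, because only the first letter is allowed to vary.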

Another interesting case is the word ‘IoT’. A word like this may occur several times in a text with different capitalizations. To match all of them irrespective of case, use:

$ grep "[iI][oO][tT]"
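When every letter may vary in case, grep's -i option (ignore case) is a shorter alternative to listing both cases for each letter. The sketch below, again with invented sample lines, suggests that both forms select the same lines.

```shell
# Sample input (hypothetical lines).
sample='IoT devices are everywhere.
iot is sometimes written in lowercase.
The IOT market is growing.
Nothing relevant here.'

# Explicit character class for each letter.
by_class=$(printf '%s\n' "$sample" | grep "[iI][oO][tT]")

# -i asks grep to ignore case for the whole pattern.
by_flag=$(printf '%s\n' "$sample" | grep -i "iot")

printf '%s\n' "$by_class"
```

The bracket form is still useful when only some of the letters should be allowed to vary.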

2. Matching mobile number using regex with grep

Regular expressions can be used to extract mobile numbers from text.

The format of the mobile number has to be known beforehand. For example, a regular expression designed to match mobile numbers won’t work for home telephone numbers.

In this example, a mobile number in the format 91-1234567890 (i.e. two digits, a separator, then ten digits) is matched. The separator may be a hyphen or a space, or it may be omitted.

$ grep "[[:digit:]]\{2\}[ -]\?[[:digit:]]\{10\}"

Only mobile numbers in the format described above are matched; numbers in other formats are ignored.
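To print just the numbers rather than the whole matching lines, the -o option can be added. The following is a sketch with invented sample text. It uses the -E form of the same pattern, since \? in a basic regular expression is a GNU extension and may not be available everywhere.

```shell
# Sample input (hypothetical lines).
sample='Call me on 91-1234567890 after six.
Office landline: 044-2345678.
Alternate: 91 9876543210 or 919812345678.'

# -o prints only the matched text, one match per line.
numbers=$(printf '%s\n' "$sample" | grep -oE "[[:digit:]]{2}[ -]?[[:digit:]]{10}")
printf '%s\n' "$numbers"
```

The landline is skipped, while the space-separated and unseparated forms are picked up because the separator is optional.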

3. Matching an email address

Extracting email addresses from text is very useful and can be achieved using grep.

An email address has a particular format. The part before the ‘@’ is the username that identifies the mailbox. Then there is a domain like gmail.com or yahoo.in.

The regular expression can be designed keeping these things in mind.

$ grep -E "[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,6}"

  • [A-Za-z0-9._%+-]+ matches the username before ‘@’
  • [A-Za-z0-9.-]+ matches the name of the domain without the ‘.com’ part
  • \.[A-Za-z]{2,6} matches the ‘.com’ or ‘.in’ etc.
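The three parts can be seen working together by piping some text through the command with -o, so that only the addresses are printed. This is a sketch; the addresses and surrounding text are invented for illustration.

```shell
# Sample input (hypothetical lines and addresses).
sample='Contact: alice.smith@example.com for details.
No address on this line.
Send logs to dev+logs@mail.example.in, thanks.'

# -o prints only the part of each line that matched the pattern.
emails=$(printf '%s\n' "$sample" | grep -oE "[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,6}")
printf '%s\n' "$emails"
```

The trailing comma after the second address is left out, since a comma is not in any of the character classes.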

4. URL checker

A URL has a particular format of representation. A regex can be built that verifies if a URL is in proper form or not.

For this check, a URL must start with http, https or ftp, followed by ‘://’. Then comes the domain name, which ends with a suffix such as ‘.com’, ‘.in’ or ‘.org’.

$ grep -E "^(http|https|ftp)://[a-zA-Z0-9.-]+\.[a-zA-Z]{2,4}"
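One way to use such a pattern as a checker is to wrap it in a small shell function and rely on grep's exit status. The sketch below assumes a lightly tidied form of the pattern, with a literal ‘//’ and the hyphen placed last in the bracket expression so that it is treated literally. The function name is_url and the sample URLs are hypothetical.

```shell
# Hypothetical helper: succeeds if the argument looks like a URL
# in the form described above. -q suppresses output, so only the
# exit status is used.
is_url() {
  printf '%s\n' "$1" | grep -qE "^(http|https|ftp)://[a-zA-Z0-9.-]+\.[a-zA-Z]{2,4}"
}

for candidate in "https://www.example.com" "ftp://files.my-site.org" "www.example.com"; do
  if is_url "$candidate"; then
    echo "valid:   $candidate"
  else
    echo "invalid: $candidate"
  fi
done
```

Because the pattern is anchored only at the start, anything after the domain (a path or a port, for example) is accepted without being checked.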

The -E option used in this example and the previous one selects extended grep, which uses extended regular expressions (ERE) instead of basic regular expressions (BRE). In ERE, metacharacters such as ?, +, {, |, ( and ) have their special meaning without being escaped, which makes writing a complex regex less tiresome.
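The difference can be illustrated with a repetition count. In the sketch below (sample lines invented), the same braces are written three ways: escaped in basic syntax, unescaped with -E, and unescaped in basic syntax, where they are taken literally.

```shell
# Sample input (hypothetical lines).
sample='aaa
a{3}'

# Basic syntax: braces must be escaped to act as a repetition count.
basic=$(printf '%s\n' "$sample" | grep '^a\{3\}$')

# Extended syntax: unescaped braces already mean repetition.
extended=$(printf '%s\n' "$sample" | grep -E '^a{3}$')

# Basic syntax, unescaped: the braces are ordinary characters.
literal=$(printf '%s\n' "$sample" | grep '^a{3}$')

printf 'basic: %s\nextended: %s\nliteral: %s\n' "$basic" "$extended" "$literal"
```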

10 practical examples of regex with grep - JournalDev