Regex in Ruby - The Beginner's Guide

If you’re new to Ruby (or any programming language), you may have come across strange bits of code like this when searching Stack Overflow for answers to life’s greatest questions, such as how to count the number of sentences in a string:

string.strip.split(/\w[?!.]/)

You may be wondering — what is this code within a code?

It’s called Regex — short for Regular Expressions! It’s not unique to Ruby (many languages have some form of regex), but we’ll focus on Ruby’s version here.

Regex code is used for specifying a certain search pattern of characters to be matched in a string (like finding words that end with -ing, or places in a string where a space follows a punctuation mark). Once we use this search pattern, we can pull out those matches or manipulate them in some way.
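As a rough sketch of the "find words that end with -ing" idea, the snippet below pulls out the matches with `scan` and then changes them with `gsub`. The sample sentence and variable names are made up for illustration, and the pattern uses pieces (`\w`, `+`, `\b`) that are only glossed in the comments.

```ruby
# Hypothetical sample text, chosen only for this illustration.
text = "She was singing and dancing while he sat reading."

# \w+ is one or more word characters, "ing" is literal, and \b requires
# the word to end right there.
ing_words = text.scan(/\w+ing\b/)

# gsub replaces every match; the block decides what each match becomes.
shouted = text.gsub(/\w+ing\b/) { |word| word.upcase }

puts ing_words.inspect # ["singing", "dancing", "reading"]
puts shouted           # She was SINGING and DANCING while he sat READING.
```

`scan` answers "what matched?", while `gsub` answers "what should the string look like after changing the matches?".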

A real-world use for regex is checking that an email address has been entered in a plausible format: some text before an @ sign, then more text before a .com (or some variation). There are countless other ways you may find it useful in your own code!
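Here is a minimal sketch of that email check. The method name `plausible_email?` and the pattern itself are assumptions made for this example. The pattern is deliberately loose (text, an @, more text, a dot, more text) and is not a complete email validator.

```ruby
# Hypothetical helper for illustration; real-world email rules are far more involved.
def plausible_email?(address)
  # \A and \z pin the pattern to the whole string.
  # [^@\s]+ means "one or more characters that are neither @ nor whitespace".
  address.match?(/\A[^@\s]+@[^@\s]+\.[^@\s]+\z/)
end

puts plausible_email?("ada@example.com") # true
puts plausible_email?("ada.example.com") # false (no @ sign)
puts plausible_email?("ada@example")     # false (no dot after the @)
```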

Regex Decoded

First things first, regex code goes between two forward slashes to differentiate it from the rest of your code:

/ *fancy code stuff* /
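To make the slashes concrete, here is a small sketch using the simplest pattern there is: plain literal characters. The string and variable names are invented for the example.

```ruby
# Hypothetical sample string for this illustration.
greeting = "hello world"

# /world/ is simply the characters w-o-r-l-d between two slashes.
position = greeting =~ /world/       # index where the match starts, or nil if there is none
found    = greeting.match?(/world/)  # true or false
missing  = greeting =~ /ruby/

puts position.inspect # 6
puts found            # true
puts missing.inspect  # nil
```

One thing to watch for: everything between the slashes counts, including spaces, so `/ world/` would look for a space followed by "world".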

Now for the interesting part: there are a few major types of pattern elements that can go inside these slashes:

1. Anchors

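^     Start of a line              $     End of a line
\A    Start of a string            \z    End of a string
\b    Any word boundary (the start or end of a word)

Note: \< and \> mark the start and end of a word in some other regex flavors, but Ruby does not support them as anchors; use \b instead.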

Anchors tell the search where to start or stop. For example:

/\A *fancy code stuff* /

says “only match ___ (whatever you put next in the regex) at the very start of the string.”

Notice the clever way \A (the first letter of the alphabet) denotes the start of a string, and \z (the last letter) denotes the end of a string.
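The difference between line anchors and string anchors is easiest to see on a string with more than one line. The following is a sketch with a made-up two-line string; `\w+` (one or more word characters) is used only so that there is something to match after or before each anchor.

```ruby
# Hypothetical two-line string for this illustration.
poem = "roses are red\nruby is too"

line_starts  = poem.scan(/^r\w+/)   # ^ matches at the start of every line
string_start = poem.scan(/\Ar\w+/)  # \A matches only at the start of the whole string
line_ends    = poem.scan(/\w+$/)    # $ matches at the end of every line
string_end   = poem.scan(/\w+\z/)   # \z matches only at the very end of the string

puts line_starts.inspect  # ["roses", "ruby"]
puts string_start.inspect # ["roses"]
puts line_ends.inspect    # ["red", "too"]
puts string_end.inspect   # ["too"]
```

When checking user input, \A and \z are usually the safer choice, because ^ and $ are satisfied by any single line inside a longer string.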

2. Groups and ranges

Groups and ranges can include numbers or letters, and tell the search what characters you’re looking to match:

[abc]         Any single character (a, b or c)

[^abc]        Any single character except a, b, or c
[a-x]         Any lowercase letter from a through x
[A-T]         Any uppercase letter from A through T
[a-zA-Z]      Any letter from a-z or A-Z
[0-7]         Any digit from 0 through 7
(a|b)         Either a or b
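To see a few of these in action, here is a short sketch run against made-up sample strings. The variable names are just for illustration.

```ruby
# Hypothetical sample string, chosen only for this illustration.
sample = "Cab 9 fade 307"

letters_abc  = sample.scan(/[abc]/)      # lowercase a, b, or c only
digits_0_7   = sample.scan(/[0-7]/)      # digits 0 through 7, so the 9 is skipped
capitals_a_t = sample.scan(/[A-T]/)      # uppercase A through T
not_abc      = "cabbage".scan(/[^abc]/)  # every character except a, b, or c

# (a|e) accepts either letter, so both spellings are matched and replaced.
either_spelling = "gray grey".gsub(/gr(a|e)y/, "GREY")

puts letters_abc.inspect  # ["a", "b", "a"]
puts digits_0_7.inspect   # ["3", "0", "7"]
puts capitals_a_t.inspect # ["C"]
puts not_abc.inspect      # ["g", "e"]
puts either_spelling      # GREY GREY
```

Note that ranges are case-sensitive: the capital C in "Cab" is ignored by [abc] but picked up by [A-T].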

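So if we had:

/\b[a-m]/

…the search is saying: “go to the start of every word, and match it if its first letter is a lowercase letter from a through m”. (Ruby uses \b for word boundaries, and spaces inside the slashes would be treated as part of the pattern, so they are left out here.)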
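A sketch of this word-start search on a made-up sentence follows. It assumes \b as the word-boundary anchor, since that is the one Ruby's regex engine supports, and the second pattern adds `\w*` (zero or more word characters) to grab the rest of each matching word.

```ruby
# Hypothetical sample sentence for this illustration.
sentence = "a quick brown fox jumps"

# \b followed by a letter can only match where a word begins.
first_letters = sentence.scan(/\b[a-m]/)
whole_words   = sentence.scan(/\b[a-m]\w*/)

puts first_letters.inspect # ["a", "b", "f", "j"]
puts whole_words.inspect   # ["a", "brown", "fox", "jumps"]
```

"quick" is skipped because q falls outside the a-m range.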
