A standard username consists of digits, lowercase letters, hyphens (-), and underscores (_). A reasonable length is 3 to 16 characters. Depending on your needs, you can change the character set (for example, allow the * character) and the length limits.
/^[a-z0-9_-]{3,16}$/
Js:
re.test('normal_login-123'); // true
re.test('IncorrectLogin'); // false
re.test('inc*rrect_l*gin'); // false
What we use:
The ^ and $ symbols mark the beginning and end of the line, so the entered username must match the pattern in full, from the first character to the last.
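As a minimal sketch of how the character set and length could be adjusted, the helper below builds the pattern from parameters. The function name makeUsernameRegex and its parameters are assumptions made for illustration only.

```javascript
// Hypothetical helper: builds a username pattern from an allowed character set
// (written as the inside of a character class) and from length limits.
function makeUsernameRegex(chars = 'a-z0-9_-', min = 3, max = 16) {
  return new RegExp(`^[${chars}]{${min},${max}}$`);
}

const defaultUserRe = makeUsernameRegex();
console.log(defaultUserRe.test('normal_login-123')); // true
console.log(defaultUserRe.test('IncorrectLogin'));   // false

// Variant that also allows * and requires at least 5 characters.
const starUserRe = makeUsernameRegex('a-z0-9_*-', 5, 16);
console.log(starUserRe.test('inc*rrect_l*gin'));     // true
```

The hyphen is kept last in the character set so that it is read as a literal character rather than as a range.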
Checking email addresses for correctness is one of a web developer's most frequent tasks. Neither subscription forms nor sign-in forms can do without it.
There are many different regular expressions for validating email. Here is one of them: not the longest and not the most complex, but accurate enough for a quick check of the address:
/^[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}$/i
Js:
re.test('correct-email@mail.com'); // true
re.test('CORRECT.email@mail123.com'); // true
re.test('incorrect-email@mail'); // false
What we use:
The i flag of the regular expression makes the comparison case-insensitive.
When checking a phone number, be sure to take the accepted formats into account, since different countries write numbers in different ways. For the American style, for example, this regular expression would be suitable:
/^\+?(\d{1,3})?[- .]?\(?(?:\d{2,3})\)?[- .]?\d\d\d[- .]?\d\d\d\d$/
Js:
re.test('(212) 348-2626'); // true
re.test('+1 832-393-1000'); // true
re.test('+1 202-456-11-11'); // false
What we use:
The ? quantifier matches the preceding element once or not at all.
Many services require you to come up with a complex password. Who determines the required degree of complexity, and how? There are some common criteria: a minimum length, mixed letter case, and the presence of letters, digits, and special characters.
To make sure your users choose strong passwords, you can use this expression, which requires at least two uppercase letters, one special character from !@#$&*, two digits, three lowercase letters, and a length of at least 8 characters (or write your own with specific requirements):
/^(?=.*[A-Z].*[A-Z])(?=.*[!@#$&*])(?=.*[0-9].*[0-9])(?=.*[a-z].*[a-z].*[a-z]).{8,}$/
Js:
re.test('qwerty'); // false
re.test('qwertyuiop'); // false
re.test('abcABC123$'); // true
What we use:
The ?= operator inside a group is a lookahead: it checks that the pattern matches further along the string without consuming characters or including the matched fragment in the result.
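One possible refinement, sketched here under the assumption that you want to tell the user which requirement failed, is to split the lookaheads into separate rules. The rule names and the failedRules function are hypothetical.

```javascript
// Each rule mirrors one part of the combined password expression.
// The rule names are illustrative assumptions.
const passwordRules = [
  { name: 'two uppercase letters', re: /[A-Z].*[A-Z]/ },
  { name: 'one special character', re: /[!@#$&*]/ },
  { name: 'two digits', re: /[0-9].*[0-9]/ },
  { name: 'three lowercase letters', re: /[a-z].*[a-z].*[a-z]/ },
  { name: 'at least 8 characters', re: /^.{8,}$/ },
];

// Returns the names of the rules that the password does not satisfy.
function failedRules(password) {
  return passwordRules
    .filter((rule) => !rule.re.test(password))
    .map((rule) => rule.name);
}

console.log(failedRules('abcABC123$')); // []
console.log(failedRules('qwertyuiop'));
// [ 'two uppercase letters', 'one special character', 'two digits' ]
```

An empty result means the password passes every rule, which corresponds to the combined expression returning true.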
The format of a postal code, like that of a phone number, depends on the country.
In Russia, everything is simple: six digits in a row without separators.
/^\d{6}$/
A US ZIP code consists of 5 digits, or 9 digits in the extended ZIP+4 format.
/^\d{5}(?:[-\s]\d{4})?$/
Js:
re.test('75457'); // true
re.test('98765-4321'); // true
What we use:
The ?: sequence at the start of a group makes it non-capturing, so its match is not stored.
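The effect of ?: can be seen by comparing the ZIP pattern with a capturing variant of the same group. This is only a small illustration; the variable names are assumptions.

```javascript
// Same ZIP pattern twice: once with a capturing group, once with ?:.
const zipCapturing = /^\d{5}([-\s]\d{4})?$/;
const zipNonCapturing = /^\d{5}(?:[-\s]\d{4})?$/;

const withGroup = '98765-4321'.match(zipCapturing);
const withoutGroup = '98765-4321'.match(zipNonCapturing);

console.log(withGroup.length, withGroup[1]); // 2 -4321
console.log(withoutGroup.length);            // 1
```

Both patterns accept the same strings; the only difference is whether the optional part is stored in the result array.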
Of course, you should not rely on regular expressions alone when checking a payment card number. However, they let you weed out obviously invalid sequences right away and avoid loading the server with an unnecessary request.
A long regular expression like this one supports several payment systems at once:
/^(?:4[0-9]{12}(?:[0-9]{3})?|5[1-5][0-9]{14}|6(?:011|5[0-9][0-9])[0-9]{12}|3[47][0-9]{13}|3(?:0[0-5]|[68][0-9])[0-9]{11}|(?:2131|1800|35\d{3})\d{11})$/
What we use:
The vertical bar | in regular expressions means alternation, that is, a choice of one of several options.
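A common additional client-side check, not part of the expression above, is the Luhn checksum that payment card numbers are generally expected to satisfy. The sketch below assumes the input is a string of digits that has already passed the regular expression; the function name luhnCheck is hypothetical.

```javascript
// Luhn checksum: going from right to left, double every second digit,
// subtract 9 from any result above 9, and require the total to be divisible by 10.
function luhnCheck(cardNumber) {
  let sum = 0;
  let doubleNext = false;
  for (let i = cardNumber.length - 1; i >= 0; i--) {
    let digit = Number(cardNumber[i]);
    if (doubleNext) {
      digit *= 2;
      if (digit > 9) digit -= 9;
    }
    sum += digit;
    doubleNext = !doubleNext;
  }
  return sum % 10 === 0;
}

console.log(luhnCheck('4111111111111111')); // true
console.log(luhnCheck('4111111111111112')); // false
```

Like the regular expression, this only filters out mistyped numbers; it says nothing about whether the card actually exists.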
Spaces at the beginning and end of a string usually carry no meaning, but they can affect parsing and data processing, so it is best to get rid of them right away.
/^[ \s]+|[ \s]+$/g
Js:
let str = ' hello ';
console.log(str.length); // 7
str = str.replace(re, '');
console.log(str.length); // 5
What we use:
The + quantifier is equivalent to {1,}: one or more repetitions.
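For comparison, JavaScript also has the built-in string method trim. The short sketch below, with assumed variable names, suggests that for this input both approaches give the same result.

```javascript
// The trimming pattern next to the built-in String.prototype.trim.
const trimRe = /^[ \s]+|[ \s]+$/g;
const rawInput = '  \thello world \n';

const trimmedByRegex = rawInput.replace(trimRe, '');
const trimmedByMethod = rawInput.trim();

console.log(trimmedByRegex === trimmedByMethod); // true
console.log(trimmedByRegex.length);              // 11
```

The regular expression remains useful when only one side should be trimmed or when the set of characters to strip needs to be customized.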
You have to work with dates very often, and they come in a great many formats. Before processing, it makes sense to check whether the received string matches the required format.
This regular expression supports several day-month-year formats: with short and full numbers (5-1-91 and 05-01-1991) and with different delimiters (slash, hyphen, or period).
/^(?:(?:31(\/|-|\.)(?:0?[13578]|1[02]))\1|(?:(?:29|30)(\/|-|\.)(?:0?[13-9]|1[0-2])\2))(?:(?:1[6-9]|[2-9]\d)?\d{2})$|^(?:29(\/|-|\.)0?2\3(?:(?:(?:1[6-9]|[2-9]\d)?(?:0[48]|[2468][048]|[13579][26])|(?:(?:16|[2468][048]|[3579][26])00))))$|^(?:0?[1-9]|1\d|2[0-8])(\/|-|\.)(?:(?:0?[1-9])|(?:1[0-2]))\4(?:(?:1[6-9]|[2-9]\d)?\d{2})$/
It even accounts for leap years!
Js:
re.test('29-02-2000'); // true
re.test('29-02-2001'); // false
What we use:
Sequences such as \1, \2, and so on are backreferences to the capture groups that hold the separator. Thanks to them, dates with mixed delimiters are filtered out:
re.test('10-10/2010'); // false
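The backreference mechanism is easier to see in a much smaller pattern. The sketch below is a simplified stand-in, not the full date expression: it does not check day or month ranges and only demonstrates how \1 forces the second separator to repeat the first.

```javascript
// Simplified pattern: two digits, a separator, two digits,
// the SAME separator again (via \1), then four digits.
const sameSeparatorRe = /^\d{2}([\/.-])\d{2}\1\d{4}$/;

console.log(sameSeparatorRe.test('10-10-2010')); // true
console.log(sameSeparatorRe.test('10/10/2010')); // true
console.log(sameSeparatorRe.test('10-10/2010')); // false
```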
An IP address identifies a specific computer on the Internet. An IPv4 address consists of four numbers (bytes) separated by dots (192.0.2.235).
/\b(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b/
What we use:
The \b assertion means “word boundary” and has zero width (that is, it does not match a character of its own).
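Because the pattern is bounded by \b rather than by ^ and $, it can also pull addresses out of a longer text. The sketch below assumes a made-up log line and adds the g flag so that match returns every occurrence.

```javascript
// The IPv4 pattern with the g flag, used to find every address in a string.
const ipv4Re = /\b(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b/g;

// Hypothetical log line used only for this example.
const logLine = 'Request from 192.0.2.235 forwarded to 10.0.0.1';

const foundAddresses = logLine.match(ipv4Re);
console.log(foundAddresses); // [ '192.0.2.235', '10.0.0.1' ]
```

To validate a whole string instead of searching inside one, the same pattern can be wrapped in ^ and $.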
IPv6 is the newer version of the IP protocol, with a more complex address syntax. The expression for checking this format looks much more intimidating, because it has to cover hexadecimal groups and the various abbreviated forms of the address:
(([0-9a-fA-F]{1,4}:){7,7}[0-9a-fA-F]{1,4}|([0-9a-fA-F]{1,4}:){1,7}:|([0-9a-fA-F]{1,4}:){1,6}:[0-9a-fA-F]{1,4}|([0-9a-fA-F]{1,4}:){1,5}(:[0-9a-fA-F]{1,4}){1,2}|([0-9a-fA-F]{1,4}:){1,4}(:[0-9a-fA-F]{1,4}){1,3}|([0-9a-fA-F]{1,4}:){1,3}(:[0-9a-fA-F]{1,4}){1,4}|([0-9a-fA-F]{1,4}:){1,2}(:[0-9a-fA-F]{1,4}){1,5}|[0-9a-fA-F]{1,4}:((:[0-9a-fA-F]{1,4}){1,6})|:((:[0-9a-fA-F]{1,4}){1,7}|:)|fe80:(:[0-9a-fA-F]{0,4}){0,4}%[0-9a-zA-Z]{1,}|::(ffff(:0{1,4}){0,1}:){0,1}((25[0-5]|(2[0-4]|1{0,1}[0-9]){0,1}[0-9])\.){3,3}(25[0-5]|(2[0-4]|1{0,1}[0-9]){0,1}[0-9])|([0-9a-fA-F]{1,4}:){1,4}:((25[0-5]|(2[0-4]|1{0,1}[0-9]){0,1}[0-9])\.){3,3}(25[0-5]|(2[0-4]|1{0,1}[0-9]){0,1}[0-9]))
Base64 is a fairly common encoding for binary data and is often used, for example, in email.
To validate a string in this format, you can use the following regular expression:
^(?:[A-Za-z0-9+/]{4})*(?:[A-Za-z0-9+/]{2}==|[A-Za-z0-9+/]{3}=)?$
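A short usage sketch in JavaScript, with the slash escaped so the pattern can be written as a regex literal. The sample strings are assumptions chosen for illustration.

```javascript
// The Base64 pattern as a JavaScript regex literal.
const base64Re = /^(?:[A-Za-z0-9+\/]{4})*(?:[A-Za-z0-9+\/]{2}==|[A-Za-z0-9+\/]{3}=)?$/;

console.log(base64Re.test('aGVsbG8='));  // true, this is "hello" encoded
console.log(base64Re.test('aGVsbG8'));   // false, the padding is missing
console.log(base64Re.test('aGVs bG8=')); // false, a space is not allowed
```

Note that the pattern only checks the shape of the string: the length must be a multiple of four and padding may appear only at the end.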
ISBN is the international standard numbering system for books. The number consists of 10 digits (ISBN-10) or 13 digits (ISBN-13). On the books themselves, ISBNs are usually split into groups by hyphens (country code, publisher code, and the code of the book itself), but the hyphens should be removed before checking and use.
This regular expression checks both formats at once:
/\b(?:ISBN(?:: ?| ))?((?:97[89])?\d{9}[\dx])\b/i
Js:
re.test('ISBN 9781106998966'); // true
re.test('1106998966'); // true
re.test('110699896x'); // true
A very simple check that a string consists only of digits:
/^\d{1,}$/
Js:
re.test('13'); // true
re.test('23yy'); // false
Splitting a large number into groups of three digits is a common task in development. It turns out to be very easy with a regular expression.
/\d{1,3}(?=(\d{3})+(?!\d))/g
Js:
'1234567890'.replace(re, '$&,'); // 1,234,567,890
What we use:
The $& combination in the replacement string inserts the matched substring.
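As a point of comparison, the built-in Intl.NumberFormat produces the same grouping for the en-US locale. The sketch below, with assumed variable names, also shows one limitation of the regular expression: it groups digits after a decimal point too.

```javascript
// The grouping pattern next to the built-in Intl.NumberFormat.
const groupingRe = /\d{1,3}(?=(\d{3})+(?!\d))/g;

const groupedByRegex = '1234567890'.replace(groupingRe, '$&,');
const groupedByIntl = new Intl.NumberFormat('en-US').format(1234567890);

console.log(groupedByRegex); // 1,234,567,890
console.log(groupedByIntl);  // 1,234,567,890

// Caveat: the pattern treats the fractional part as just another run of digits.
const withFraction = '1234.56789'.replace(groupingRe, '$&,');
console.log(withFraction);   // 1,234.56,789
```

For strings that may contain a fractional part, it is safer to split on the decimal point first and apply the pattern to the integer part only.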
Prices can be presented in many different formats. There is most likely no universal regular expression for them, but it is very easy to extract a dollar price from a string.
This regular expression assumes that commas separate the thousands groups and a period separates the fractional part:
/(\$[0-9,]+(\.[0-9]{2})?)/
Js:
let price = 'price $5,555.55'.match(re)[0]; // '$5,555.55'
What we use:
The {2} quantifier means that a character from the range [0-9] must be repeated exactly 2 times (the fractional part of the price).
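Once the price has been extracted, it usually needs to become a number. A minimal sketch of that step, assuming the same dollar format with comma separators; the variable names are made up for the example.

```javascript
// Extract the price with the pattern, then strip "$" and "," before converting.
const priceRe = /(\$[0-9,]+(\.[0-9]{2})?)/;

const priceText = 'price $5,555.55'.match(priceRe)[0];
const priceAmount = Number(priceText.replace(/[$,]/g, ''));

console.log(priceText);   // $5,555.55
console.log(priceAmount); // 5555.55
```

Keep in mind that match returns null when the string contains no price, so real code should check the result before reading index 0.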
If you need to check whether a string is a URL, you can use this regular expression:
/[-a-zA-Z0-9@:%_\+.~#?&\/=]{2,256}\.[a-z]{2,4}\b(\/[-a-zA-Z0-9@:%_\+.~#?&\/=]*)?/i
It is suitable for addresses with different protocols (HTTP, HTTPS, FTP) and even without a protocol. The pattern is used without the g flag, since a global regular expression remembers its position between test calls.
Js:
re.test('https://yandex.ru'); // true
re.test('yandex.ru'); // true
re.test('hello world'); // false
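A detail worth keeping in mind when one pattern is reused across several test calls: a regular expression created with the g flag stores the position of its last match in lastIndex and starts the next search from there. The sketch below runs the URL pattern with and without g; the variable names are assumptions.

```javascript
// With the g flag, test() continues from lastIndex instead of from index 0.
const urlGlobalRe = /[-a-zA-Z0-9@:%_\+.~#?&\/=]{2,256}\.[a-z]{2,4}\b(\/[-a-zA-Z0-9@:%_\+.~#?&\/=]*)?/gi;

const firstGlobal = urlGlobalRe.test('https://yandex.ru');
const indexAfterFirst = urlGlobalRe.lastIndex;
const secondGlobal = urlGlobalRe.test('yandex.ru');

console.log(firstGlobal);     // true
console.log(indexAfterFirst); // 17
console.log(secondGlobal);    // false, the search started past the end of the string

// Without the g flag every call starts from the beginning.
const urlPlainRe = /[-a-zA-Z0-9@:%_\+.~#?&\/=]{2,256}\.[a-z]{2,4}\b(\/[-a-zA-Z0-9@:%_\+.~#?&\/=]*)?/i;

const firstPlain = urlPlainRe.test('https://yandex.ru');
const secondPlain = urlPlainRe.test('yandex.ru');

console.log(firstPlain, secondPlain); // true true
```

For simple yes-or-no validation, leaving out the g flag avoids this state entirely.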
A URL has many parts: protocol, domain, subdomains, page path, and query string. With a regular expression, you can discard everything unnecessary and get only the domain:
/https?:\/\/(?:[-\w]+\.)?([-\w]+)\.\w+(?:\.\w+)?\/?.*/i
Js:
let domain = 'https://proglib.io'.match(re);
console.log(domain[1]); // proglib
What we use:
The match method returns an array with the match data. Index 1 holds the substring matched by the first capture group.
A single regular expression lets you quickly get the extension of the file you are working with:
/^(?:.*\.(?=(htm|html|class|js)$))?[^.]*$/i
Js:
let file1 = 'script.js'.match(re)[1]; // js
let file2 = 'hello'.match(re)[1]; // undefined
Of course, if necessary, you can add other extensions here.
Sometimes it is required to extract the protocol of the received link. Regular expressions make life easier here:
/^([a-zA-Z]+):\/\//
Js:
let protocol = 'https://proglib.io/'.match(re)[1]; // https
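When the input is known to be a complete absolute URL, the built-in URL class offers another way to reach the same parts. This is an alternative technique rather than a regular expression; the sketch assumes an environment where URL is available globally, such as current browsers and Node.js.

```javascript
// Parsing with the built-in URL class instead of a pattern.
const parsedUrl = new URL('https://proglib.io/');

console.log(parsedUrl.protocol); // https:
console.log(parsedUrl.hostname); // proglib.io

// The protocol property keeps the trailing colon; strip it if needed.
const schemeOnly = parsedUrl.protocol.replace(/:$/, '');
console.log(schemeOnly);         // https
```

The constructor throws on strings that are not valid absolute URLs, so the regular expressions stay useful for loose or partial input.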
Twitter username:
/@([A-Za-z0-9_]{1,15})/
Facebook account URL:
/(?:http:\/\/)?(?:www\.)?facebook\.com\/(?:(?:\w)*#!\/)?(?:pages\/)?(?:[\w\-]*\/)*([\w\-]*)/
Js:
re.exec('https://www.youtube.com/watch?v=JbgvaQ_rM4I')[1]; // JbgvaQ_rM4I
What we use:
The exec method of a regular expression object works almost the same way as the string method match.
A web developer often has to deal with colors given in hexadecimal format. A regular expression makes it easy to extract such colors from a string:
/#([a-fA-F0-9]{3,6})/
The address of an image is usually obtained with the DOM method img.getAttribute('src'). Regular expressions are rarely used for this, but it is useful to know what they can do:
/< *img[^>]* src *= *["']?([^"' >]*)/
Js:
re.exec('<img src="image.png" alt="image1">')[1]; // image.png
Another nontrivial situation is getting CSS properties using regular expressions:
/\s*[a-zA-Z\-]+\s*[:]{1}\s[a-zA-Z0-9\s.#]+[;]{1}/gm
Js:
let css = ` .element {
color: white;
background: black;
font-size: 16px;
}`
css.match(re);
What we use:
The m flag turns on multiline mode.
And this is a very useful tool for removing comments from HTML code:
/<!--(.*?)-->/
What we use:
A ? character placed after another quantifier switches it to lazy mode.
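To actually strip every comment, including ones that span several lines, the pattern needs two adjustments that are assumptions on top of the expression above: the g flag, and [\s\S] in place of the dot, which does not match line breaks by default. A minimal sketch with a made-up HTML fragment:

```javascript
// Variant of the comment pattern: g removes all comments,
// [\s\S] lets a comment span several lines.
const htmlCommentRe = /<!--[\s\S]*?-->/g;

// Hypothetical markup used only for this example.
const markup = '<p>one</p><!-- note --><p>two</p><!--\n multi-line\n--><p>three</p>';

const withoutComments = markup.replace(htmlCommentRe, '');
console.log(withoutComments); // <p>one</p><p>two</p><p>three</p>
```

The lazy *? is what keeps the first comment from swallowing everything up to the end of the last one.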
You can get the title of a webpage using the following regular expression:
/<title>([^<>]*?)<\/title>/
An important SEO task that you really do not want to do by hand is adding the rel="nofollow" attribute to external links. Let's turn to regular expressions:
PHP:
$html = '<a href="https://site.com">site.com</a>,
<a href="my-site.com">my-site.com</a>,
<a href="https://site.com" rel="nofollow">site.com</a>';
$re = '/(<a\s*(?![^>]*\brel=)([^>]*\bhref=\"https?:\/\/[^"]+\"))/';
$result = preg_replace($re, '$1 rel="nofollow"', $html);
This regular expression finds all links in the text that use the http or https protocol and have no rel attribute, and adds the attribute to them.
If you want to analyze CSS media queries, use this regular expression:
/@media([^{]+)\{([\s\S]+?})\s*}/g
What we use:
The \s class denotes a whitespace character (including tab and newline), and the \S class, on the contrary, any character except whitespace.
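A possible way to apply the media query pattern is to iterate over all matches with matchAll and read the capture groups. The stylesheet below is a made-up example, and the variable names are assumptions.

```javascript
// Collect the condition of every @media block in a stylesheet.
const mediaRe = /@media([^{]+)\{([\s\S]+?})\s*}/g;

// Hypothetical stylesheet used only for this example.
const sheet = `
@media (max-width: 600px) {
  .menu { display: none; }
}
@media print {
  body { color: black; }
  a { text-decoration: none; }
}`;

// Group 1 holds the condition, group 2 the rules inside the block.
const mediaConditions = [...sheet.matchAll(mediaRe)].map((m) => m[1].trim());
console.log(mediaConditions); // [ '(max-width: 600px)', 'print' ]
```

The [\s\S]+? part is what allows a block to contain several rules on separate lines.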
A useful expression for finding and highlighting words in text:
/\b(ipsum)\b/ig
Js:
let text = 'Lorem ipsum dolor, lorem ipsum dolor.';
text.replace(re, '<span style="background: yellow">$&</span>');
PHP:
$re = '/\b(ipsum)\b/i';
$text = 'Lorem ipsum dolor, lorem ipsum dolor.';
preg_replace($re, '<span style="background:#5fc9f6">$1</span>', $text);
Of course, the word ipsum can be replaced by any other word or phrase.
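If the word to highlight comes from user input, it should be escaped before it is placed into a pattern. The sketch below assumes two hypothetical helpers, escapeRegExp and highlight, and uses a mark element instead of a styled span.

```javascript
// Escape characters that have a special meaning in regular expressions.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Wrap every whole-word occurrence of `word` in <mark>, ignoring case.
function highlight(text, word) {
  const wordRe = new RegExp(`\\b(${escapeRegExp(word)})\\b`, 'gi');
  return text.replace(wordRe, '<mark>$1</mark>');
}

console.log(highlight('Lorem ipsum dolor, lorem Ipsum dolor.', 'ipsum'));
// Lorem <mark>ipsum</mark> dolor, lorem <mark>Ipsum</mark> dolor.
```

Without the escaping step, a search term such as a.b would also match axb, because the dot would be read as "any character".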
Fortunately, good old IE is gradually becoming a thing of the past, but it still plays a role in the modern web. This expression lets you detect old versions of Internet Explorer (5 through 8) from the user agent string:
/^.*MSIE [5-8](?:\.[0-9]+)?(?!.*Trident\/[5-9]\.0).*$/
Regular expressions let you automatically remove accidental word repetitions without reading through the entire text:
/\b(\w+)\s+\1\b/gi
Js:
'hello world world hello'.replace(re, '$1'); // hello world hello
Sometimes a web developer needs to determine the number of words in a line, for example, to organize keywords in analytics tools. You can do this with the following regular expressions:
^[^\s]*$ // exactly one word
^[^\s]*\s[^\s]*$ // exactly two words
^[^\s]*\s[^\s]* // two words or more
^([^\s]*\s){2}[^\s]*$ // exactly three words
^([^\s]*\s){4,}[^\s]*$ // five words or more
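When the exact number is needed rather than a yes-or-no answer, splitting on whitespace is a simpler route. This is a minimal sketch; the function name countWords is an assumption.

```javascript
// Count words by splitting on runs of whitespace.
function countWords(line) {
  const words = line.trim().split(/\s+/);
  // Splitting an empty string yields [''], which should count as zero words.
  return words[0] === '' ? 0 : words.length;
}

console.log(countWords('one'));              // 1
console.log(countWords('  two   words '));   // 2
console.log(countWords(''));                 // 0
```

Unlike the patterns above, this version tolerates several spaces in a row between words.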