Intro to Regular Expressions with Useful Examples

Released Friday, 10th July 2015


Ever needed to clean up some HTML content, grab the text inside a tag, or find all the email addresses in a document? Well, regular expressions are here to the rescue.

You are watching Web Snacks, and I’m John Harbison.

Regular expressions, sometimes called regex, are sequences of characters used for pattern matching. They are commonly used in many programming languages like Perl, PHP, JavaScript, Ruby and many others, but today we are using them in a text editor. I will be using Sublime Text for my examples.

You can find a list of editors that support regex on Wikipedia:
https://en.wikipedia.org/wiki/List_of_regular_expression_software

Why would you use Regular Expressions?
Simply because it saves time.
Sure, you could go through a document and click, copy and edit by hand, but it is much simpler to have the computer work for you. Instead of manually copying each email address in the document, you could grab all of them at once. Instead of inserting line breaks one at a time to format the document, you can insert all of them programmatically. Regex lets the computer work for you.

To really understand regex you need to understand the syntax. Today, though, I want to give you some simple pattern matches that are extremely useful, and we are going to use Sublime Text to see them in real time.

For an example, I’ve got a simple HTML document that has all of its lines joined together. You might see nasty HTML like this if you open someone else’s code or if a site has been minified to eliminate white space. You may need to edit it, so being able to make it more human-readable can be a time saver.

So I’ve opened the document in Sublime Text. I hit Command+F on Mac or Ctrl+F on Windows to open the Find dialogue. On the left of the Find bar there are some buttons that I need to select. The first is the .* button, which means we are searching with regular expressions. The second is the “Wrap” button, which lets me search the entire document at once instead of just below or above my cursor. The last is the “Highlight matches” button, which is directly to the left of the search box.

Now I’m ready to work.

The first thing I want to do is split up my paragraphs of content. In the search field I will type

</p>

which will match all of the closing paragraph tags. As I type it in, you’ll see that they all become highlighted. I’ll then hit the “Find All” button, which selects every instance of the closing paragraph tag. Hit the right arrow key on the keyboard, which moves each cursor to the end of its selected tag. Hit Enter twice to insert line breaks between the paragraphs. This makes the wall of text much more readable.
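If you would rather script this than click through the editor, the same find-and-insert step can be sketched in Python with the standard `re` module. This is only a rough equivalent of the Sublime Text steps, and the sample `html` string is made up for illustration, since the actual document from the episode is not shown here.

```python
import re

# Hypothetical minified input; the episode's real HTML file is not shown.
html = "<p>First paragraph.</p><p>Second paragraph.</p><p>Third.</p>"

# A literal match on the closing paragraph tag. None of these characters
# is special in a regex, so the tag can be typed as-is.
# The replacement keeps the tag and adds two line breaks after it,
# like pressing the right arrow and then Enter twice in the editor.
readable = re.sub(r"</p>", "</p>\n\n", html)

print(readable)
```

Each closing tag is kept in place and followed by a blank line, which is the same result as the Find All, right arrow, Enter, Enter sequence.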

We are going to continue making the markup readable by putting all of the list items on separate lines. Hit Command+F to pull up the Find dialogue. Type in the closing list item tag

</li>

Click Find All, then hit the right arrow, then Enter. Just like with the paragraphs, all of our list items are now easier to read. We can finish up by moving the opening unordered list tags to their own lines as well. Also, in Sublime Text you can select multiple lines and hit the Tab key to indent them.
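The list cleanup can be sketched the same way in Python. The sample list below is invented for illustration, and the three substitutions are just one way to mirror the editor steps: break after the opening list tag, break after each closing list item tag, then indent the items.

```python
import re

# Hypothetical one-line list, standing in for the minified markup.
html = "<ul><li>Apples</li><li>Pears</li></ul>"

# Move the opening unordered list tag onto its own line.
step1 = re.sub(r"<ul>", "<ul>\n", html)

# Put a line break after every closing list item tag.
step2 = re.sub(r"</li>", "</li>\n", step1)

# Indent each list item, like selecting the lines and hitting Tab.
# With re.MULTILINE, ^ matches at the start of every line.
readable = re.sub(r"^<li>", "\t<li>", step2, flags=re.MULTILINE)

print(readable)
```

The `re.MULTILINE` flag is what lets `^` anchor to each line rather than only to the start of the whole string.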

For these two examples all we did was a straight tag match. Regex will literally match most of what you type. There are some scenarios where you have to add backslashes to get special characters to match, but for the most part, if you type it, it will match. For this reason you may have actually used a regular expression matching system in the past and not even known it.
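To make the point about backslashes concrete, here is a small Python sketch. The sample text is made up, and the dot and dollar sign are just two examples of characters that carry a special meaning in a regex.

```python
import re

# Hypothetical sample text for illustration.
text = "Price: $5.00 (sale), was $5x00"

# Unescaped, "." matches any character, so this also matches "5x00".
loose = re.findall(r"5.00", text)

# Escaped with a backslash, "\." matches only a literal dot.
strict = re.findall(r"5\.00", text)

# re.escape adds the backslashes for you when you want a literal match.
pattern = re.escape("$5.00")
exact = re.findall(pattern, text)

print(loose, strict, exact)
```

Plain tags like the ones in the two examples above contain no special characters, which is why they matched without any escaping.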

But let’s do some other things that might be a little more complicated.

What if we wanted to make a collection of links? For this I’m going to select all of the anchor tags, complete with their linked content, and put them all in a list at the bottom of the page.
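As a rough idea of what that goal looks like in code, here is a Python sketch. The pattern is one possible way to match an anchor tag and is not necessarily the one used in the episode. The sample `html` is invented for illustration.

```python
import re

# Hypothetical sample markup for illustration.
html = (
    '<p>See <a href="https://example.com">Example</a> and '
    '<a href="/about">About us</a>.</p>'
)

# <a\b     the opening tag name; \b avoids matching tags like <abbr>
# [^>]*>   any attributes, up to the end of the opening tag
# .*?      the linked content, as little as possible (non-greedy)
# </a>     the closing anchor tag
links = re.findall(r"<a\b[^>]*>.*?</a>", html, flags=re.DOTALL)

# One link per line, ready to paste at the bottom of the page.
link_list = "\n".join(links)

print(link_list)
```

The non-greedy `.*?` matters here: a plain `.*` would run from the first opening anchor tag to the last closing one and return both links as a single match.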

Let’s start by adding a header. I’ll add an