Intro to Regular Expressions

Learning Goals


Key Vocab


Introduction

Say you're working at your new job as a developer, and your supervisor asks you to build validation for the email field in the company's signup form. There have recently been a lot of sign-ups with invalid email addresses (e.g., "joeflatiron.com", "@helloworld.com", and "$%adam@gmail.com"). First, you sit down and come up with a set of rules that any email address should adhere to (stop reading and see how many you can come up with).

We now have a pattern that we know all email addresses must follow. We use regular expressions (usually shortened to regex) to encode these patterns for matching, searching, and substitution. Here's a sample regular expression for email validation:

r'\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b'

If this doesn't make any sense, don't worry. We'll be covering how to write and read regular expressions shortly.

(There are actually a LOT more rules for email addresses, but you get the point.)
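
To make the sample pattern a little more concrete, here is a minimal sketch of how it might be used in Python. The helper name is_valid_email is hypothetical (it is not part of any library), and the final character class is written as [A-Za-z] because a | inside square brackets is treated as a literal pipe character rather than "or". It uses re.fullmatch so that the entire string has to fit the pattern.

```python
import re

# Sample email pattern from this lesson, compiled once for reuse.
# The last class is [A-Za-z]: inside [], a | would be a literal pipe.
EMAIL_PATTERN = re.compile(r'\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b')


def is_valid_email(text):
    """Hypothetical helper: True only if the whole string fits the pattern."""
    return EMAIL_PATTERN.fullmatch(text) is not None


candidates = [
    "joe@flatiron.com",   # username, @, domain name, dot, domain type
    "joeflatiron.com",    # missing the @ symbol
    "@helloworld.com",    # missing the username
    "$%adam@gmail.com",   # starts with characters the pattern rejects
]

for candidate in candidates:
    print(candidate, is_valid_email(candidate))
```

Only the first candidate is accepted. Using re.search instead of re.fullmatch would be more lenient, since it would find "adam@gmail.com" inside the last string and report a match.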

History

RegEx came about in the 1950s and 1960s in various forms. Among the first appearances of regular expressions in program form was when Ken Thompson built Stephen Cole Kleene's notation into the editor QED as a means to match patterns in text files. Since then, various implementations of regular expressions have been developed.

We'll be using Python regular expressions, an implementation mostly based on the one in the Perl language. A key difference between the two is that regex in Python requires us to import the re module, whereas regex is natively supported in Perl and many other languages.
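
As a quick illustration of what importing the re module looks like in practice, here is a small sketch. The sample sentence is made up for this example; re.search and the \d+ pattern (one or more digits) are standard parts of Python's re module.

```python
import re

# Made-up sample text for illustration.
sentence = "Phase 3 starts in 13 days"

# re.search scans the string and returns a match object for the
# first place the pattern fits, or None if it fits nowhere.
match = re.search(r'\d+', sentence)

if match:
    print(match.group())  # the first run of digits: 3
```

The r prefix makes the pattern a raw string, so backslashes such as the one in \d reach the regex engine unchanged.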

When to use RegEx

Regular expressions are an extremely powerful way to search through strings and blocks of text for specific patterns. They can be used for data validation, searching, mass file renaming, and finding records in a database. Use them carefully. They are like a surgeon's scalpel: able to do a lot of harm or good, depending on how skillfully they are wielded.
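
The sketch below shows three of these uses with Python's re module: searching, validation, and renaming. All of the data, the order-ID and ZIP-code formats, and the helper name is_zip_code are invented for illustration, and the renaming step only computes new names without touching any real files.

```python
import re

# Searching: pull every order ID (a capital letter, a dash, four digits)
# out of a block of text.
log = "Orders: A-1001 shipped, B-2040 pending, A-1177 shipped"
order_ids = re.findall(r'[A-Z]-\d{4}', log)
print(order_ids)  # ['A-1001', 'B-2040', 'A-1177']


# Validation: accept only strings that are exactly five digits.
def is_zip_code(text):
    """Hypothetical helper for a five-digit ZIP code check."""
    return re.fullmatch(r'\d{5}', text) is not None


print(is_zip_code("10004"))   # True
print(is_zip_code("1000a"))   # False

# Renaming: replace each run of whitespace in a filename with an underscore.
filenames = ["report 2021.txt", "report  2022.txt"]
renamed = [re.sub(r'\s+', '_', name) for name in filenames]
print(renamed)  # ['report_2021.txt', 'report_2022.txt']
```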


Resources