What is a regular expression?

"A regular expression is a sequence of characters that define a search pattern. Usually such patterns are used by string searching algorithms for "find" or "find and replace" operations on strings, or for input validation. It is a technique developed in theoretical computer science and formal language theory." (Wikipedia)

It works much like searching for something on Google, but in a more advanced and specific way.
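To make the three uses named above concrete (find, find and replace, and input validation), here is a minimal sketch using Python's built-in `re` module. The sample text and the five-digit rule are made up for illustration.

```python
import re

# Hypothetical sample text, not taken from any real data set.
text = "Order 66 shipped to 221B Baker Street"

# Find: collect every run of digits in the text.
numbers = re.findall(r"\d+", text)

# Find and replace: swap every run of digits for "#".
masked = re.sub(r"\d+", "#", text)

# Input validation: accept only strings that are exactly five digits.
is_valid = re.fullmatch(r"\d{5}", "90210") is not None

print(numbers)   # ['66', '221']
print(masked)    # Order # shipped to #B Baker Street
print(is_valid)  # True
```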

Why does it matter?

  • Reduces human error when sorting or wrangling large amounts of data
  • Saves you time and effort: once you've written a pattern, it's reusable

The basics

Character class



// 1st word including hyphen, e.g. Bethune-Cookman
// 1st 2 words with/without "'s"
// 2 words only
// searching ???
// Last word in parentheses
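The comments above name targets without a pattern next to them. As one possible reading, the sketch below uses character classes in Python's `re` module to hit three of those targets. The patterns and all sample strings except Bethune-Cookman are assumptions, not the original ones.

```python
import re

# 1st word including a hyphen: the class [\w-] allows word characters and "-".
first_word = re.match(r"^[\w-]+", "Bethune-Cookman Wildcats").group()

# 1st two words with or without "'s": the class [\w'] also allows an apostrophe.
first_two = re.match(r"^[\w']+\s[\w']+", "Saint Mary's Gaels").group()

# Last word in parentheses: escape the parentheses and capture what is inside.
in_parens = re.search(r"\((\w+)\)$", "Miami (Ohio)").group(1)

print(first_word)  # Bethune-Cookman
print(first_two)   # Saint Mary's
print(in_parens)   # Ohio
```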

HEX codes

[A-Fa-f0-9]{6} // single
// HEX codes from "Primary: 0050A3 Secondary: FFFFFF"
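As a sketch of how the six-character pattern could pull both codes out of the sample line, the example below applies it with Python's `re.findall`. The added `\b` word boundaries and the label-capturing variant are assumptions layered on top of the original pattern.

```python
import re

line = "Primary: 0050A3 Secondary: FFFFFF"

# Every standalone run of exactly six hex digits.
hex_codes = re.findall(r"\b[A-Fa-f0-9]{6}\b", line)

# Variant: capture the label together with its code, then build a dict.
pairs = dict(re.findall(r"(\w+): ([A-Fa-f0-9]{6})", line))

print(hex_codes)  # ['0050A3', 'FFFFFF']
print(pairs)      # {'Primary': '0050A3', 'Secondary': 'FFFFFF'}
```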


// team names
// breaking
^\w+-?\w+ // catch hyphenated compound
^\w+\b-?\w* // break two words
^\w+'?\w?\s\w+'?s? // break words with apostrophe
\w+\s\w+$ // capture last 2 words
\w+$ // capture last word
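To see what the two end-anchored patterns capture, here is a small sketch with Python's `re.search`. The full team names are sample inputs chosen for illustration.

```python
import re

# Sample inputs chosen for illustration.
one_word_team = "Bethune-Cookman Wildcats"
two_word_team = "North Carolina Tar Heels"

# \w+$ keeps only the final word, because $ anchors the match to the end.
last_word = re.search(r"\w+$", one_word_team).group()

# \w+\s\w+$ keeps the final two words separated by one whitespace character.
last_two_words = re.search(r"\w+\s\w+$", two_word_team).group()

print(last_word)       # Wildcats
print(last_two_words)  # Tar Heels
```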
// optimized
^[\w'&.-]+[ &]?[\w'&.]+
// workflow
// 1. break school with State
^[\w'&.-]+[ &]?[\w'&.]+ State // school name followed by the word "State"
^[\w'&.-]+[ &]?[\w'&.]+ (?:State|University) // followed by "State" or "University"
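A sketch of workflow step 1 in Python, assuming the goal is to match a school name that ends in the literal word State or University. It uses the group `(?:State|University)` because a character class such as `[State]+` matches any run of the letters S, t, a, e rather than the word itself. The helper name `school_part` and the sample names are hypothetical.

```python
import re

# School name (one or two words) followed by the literal word State or University.
SCHOOL = re.compile(r"^[\w'&.-]+[ &]?[\w'&.]+ (?:State|University)")

def school_part(name):
    """Return the leading school portion of a name, or None if there is no match."""
    match = SCHOOL.match(name)
    return match.group() if match else None

print(school_part("Ohio State Buckeyes"))        # Ohio State
print(school_part("San Diego State Aztecs"))     # San Diego State
print(school_part("Boston University Terriers")) # Boston University
print(school_part("Texas A&M Aggies"))           # None

# Why a character class is not enough: [State]+ happily matches "eat".
print(re.fullmatch(r"[State]+", "eat") is not None)  # True
```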