REGEX

What is a regular expression?

A regular expression is a sequence of characters that define a search pattern. Usually such patterns are used by string searching algorithms for "find" or "find and replace" operations on strings, or for input validation. It is a technique developed in theoretical computer science and formal language theory. (Source: Wikipedia)

It works much like searching for something on Google, but in a more advanced and specific way.

Why does it matter?

  • Reduces human error when sorting or wrangling large amounts of data

  • Saves you time and effort: once you have written a pattern, it's reusable

The basics

Character class

[]
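
A character class matches exactly one character out of the set listed between the square brackets. The following is a minimal sketch using Python's built-in `re` module; the sample strings are made up for illustration.

```python
import re

# [aeiou] matches any single lowercase vowel.
vowels = re.findall(r"[aeiou]", "regex")

# A hyphen between two characters defines a range; [0-9] is any digit.
digits = re.findall(r"[0-9]", "Room 42B")

# A leading ^ negates the class: any character that is NOT a digit.
non_digits = re.findall(r"[^0-9]", "4a2b")

print(vowels)      # ['e', 'e']
print(digits)      # ['4', '2']
print(non_digits)  # ['a', 'b']
```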

Cookbook

// 1st word, including a hyphen, e.g. Bethune-Cookman
^\w+\b[-]{0,1}(\w+)?


// 1st 2 words, with or without "'s"
^\w+\s\w+[\']?s?

// 2 words only
^\w+\s\w+$ 

// searching ???
.+\(.+\)

// Text inside the last pair of parentheses (captured in group 1)
\(([^)]*)\)[^(]*$
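
To check what the cookbook patterns actually capture, they can be run against a few sample strings. This is only a sketch in Python's `re` module: the helper `first_match` and the sample team names are assumptions chosen for illustration, not part of the original notes.

```python
import re

def first_match(pattern, text):
    """Return the matched text, or None when the pattern does not match."""
    m = re.search(pattern, text)
    return m.group(0) if m else None

# 1st word, including a hyphen
hyphenated = first_match(r"^\w+\b[-]{0,1}(\w+)?", "Bethune-Cookman Wildcats")

# 1st 2 words, with or without "'s"
possessive = first_match(r"^\w+\s\w+[\']?s?", "Saint Mary's Gaels")

# 2 words only: a three-word string is rejected
two_words = first_match(r"^\w+\s\w+$", "Notre Dame")
three_words = first_match(r"^\w+\s\w+$", "Boston College Eagles")

# Text inside the last pair of parentheses (group 1)
last_parens = re.search(r"\(([^)]*)\)[^(]*$", "Miami (FL) Hurricanes (ACC)").group(1)

print(hyphenated)   # Bethune-Cookman
print(possessive)   # Saint Mary's
print(two_words)    # Notre Dame
print(three_words)  # None
print(last_parens)  # ACC
```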

HEX codes

[A-Fa-f0-9]{6} // a single HEX code


// HEX codes from "Primary: 0050A3 Secondary: FFFFFF"
// (word boundaries and a hex-only class, so that "Secondary" is not matched)
\b[A-Fa-f0-9]{6}\b
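
As a rough sketch, the codes can be pulled out of a line like the one above with `re.findall`. The pattern here restricts the class to hex digits and uses word boundaries (`\b`) so that each match is exactly six characters long; the variable names are illustrative.

```python
import re

text = "Primary: 0050A3 Secondary: FFFFFF"

# Exactly six hex digits, delimited by word boundaries on both sides.
HEX_CODE = re.compile(r"\b[A-Fa-f0-9]{6}\b")

codes = HEX_CODE.findall(text)
print(codes)  # ['0050A3', 'FFFFFF']
```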

NCAA

// team names: match "Alt" / "ALT" tokens (to be replaced with an empty string)
(\w*Alt\w*){0,1}((-\s)?\w*ALT\w*){0,1}

// breaking 
^\w+[-]?\w+ // catch hyphenated compound
^\w+\b[-]{0,1}(\w+)? // break two words
^\w+[']?\w?\s\w+[\']?s? // break words with apostrophe

\w+\s\w+$ // capture last 2 words 
\w+$ // capture last word 
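
A quick way to see the two end-anchored patterns at work is to run them on one name. This is a sketch only; the sample string is an assumption picked for illustration.

```python
import re

name = "Florida State Seminoles"

# $ anchors the match to the end of the string, so re.search skips ahead
# until it finds a start position from which the pattern reaches the end.
last_two = re.search(r"\w+\s\w+$", name).group(0)
last_one = re.search(r"\w+$", name).group(0)

print(last_two)  # State Seminoles
print(last_one)  # Seminoles
```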

// optimized 
^[\w'&.-]+[ &]?[\w'&.]+ // hyphen placed last so it is a literal, not a range


// workflow 
// 1. break off the school name ending in "State" (or "University")
^[\w'&.-]+[ &]?[\w'&.]+ State
^[\w'&.-]+[ &]?[\w'&.]+ (State|University)
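
The first workflow step could be sketched in Python as below. The function name `split_school` and the sample inputs are hypothetical. Note that the words are written as an alternation, since square brackets would match individual letters rather than the whole word, and the hyphen sits last in the character class so that it is read as a literal.

```python
import re

# School name: one or two name tokens followed by "State" or "University".
SCHOOL = re.compile(r"^[\w'&.-]+[ &]?[\w'&.]+ (?:State|University)\b")

def split_school(name):
    """Return (school, rest), or None when no school suffix is found."""
    m = SCHOOL.match(name)
    if m is None:
        return None
    return m.group(0), name[m.end():].strip()

print(split_school("San Diego State Aztecs"))     # ('San Diego State', 'Aztecs')
print(split_school("Boston University Terriers")) # ('Boston University', 'Terriers')
print(split_school("Duke Blue Devils"))           # None
```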
