REGEX

Recommended Readings

What is a regular expression?

A regular expression is a sequence of characters that define a search pattern. Usually such patterns are used by string searching algorithms for "find" or "find and replace" operations on strings, or for input validation. It is a technique developed in theoretical computer science and formal language theory. (Source: Wikipedia)

It works much like searching for something on Google, but in a more advanced and specific way.
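The three uses named in the definition (find, find and replace, input validation) can be sketched in Python with the standard re module. The sample text and the helper name is_zip are illustrative assumptions, not part of these notes.

```python
import re

text = "Order 66 shipped on 2024-01-15"

# "Find": locate the first date-like substring (YYYY-MM-DD).
found = re.search(r"\d{4}-\d{2}-\d{2}", text)
date = found.group(0) if found else None

# "Find and replace": mask every run of digits with "#".
masked = re.sub(r"\d+", "#", text)

# Input validation: the whole string must be exactly five digits.
# (is_zip is a hypothetical helper used only for this sketch.)
def is_zip(value: str) -> bool:
    return re.fullmatch(r"\d{5}", value) is not None
```

Here date is "2024-01-15", masked is "Order # shipped on #-#-#", and is_zip("90210") is True while is_zip("9021a") is False.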

Why does it matter?

  • Reduces human error when sorting or wrangling large amounts of data

  • Saves time and effort: once you have written a pattern, it's reusable

The basics

Character class

[] // matches any single character from the set inside the brackets, e.g. [abc], [a-z], [^0-9]
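A minimal sketch of how a character class behaves, using Python's re module. The sample strings are made up for illustration.

```python
import re

# [aeiou] matches any ONE character listed in the brackets.
vowels = re.findall(r"[aeiou]", "regex")

# Ranges are allowed: [A-Fa-f0-9] is any one hexadecimal digit.
hex_digits = re.findall(r"[A-Fa-f0-9]", "0x1G")

# A leading ^ negates the class: [^0-9] is any one non-digit.
non_digits = re.findall(r"[^0-9]", "a1b2")
```

The results are ['e', 'e'], ['0', '1'] and ['a', 'b'] respectively.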

Cookbook

// 1st word including hyphen, e.g. Bethune-Cookman
^\w+\b[-]{0,1}(\w+)?
// 1st 2 words with/without "'s"
^\w+\s\w+[\']?s?
// 2 words only
^\w+\s\w+$
// any text followed by something in parentheses
.+\(.+\)
// Last group in parentheses (capture group 1 holds the text inside)
\(([^)]*)\)[^(]*$
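The cookbook patterns above can be tried out in Python as follows. The helper first_match and the sample names are assumptions made for this sketch.

```python
import re

def first_match(pattern, text):
    """Return the matched text, or None when the pattern does not match."""
    m = re.search(pattern, text)
    return m.group(0) if m else None

# 1st word including hyphen
first_word = first_match(r"^\w+\b[-]{0,1}(\w+)?", "Bethune-Cookman Wildcats")

# 1st 2 words with/without "'s"
two_words = first_match(r"^\w+\s\w+[\']?s?", "Saint Mary's Gaels")

# 2 words only: matches the whole string or nothing
exactly_two = first_match(r"^\w+\s\w+$", "Boston College")
too_long = first_match(r"^\w+\s\w+$", "Boston College Eagles")

# Last group in parentheses: group(1) is the text inside the brackets
last_paren = re.search(r"\(([^)]*)\)[^(]*$", "Miami (FL) Hurricanes (ACC)").group(1)
```

With these inputs, first_word is "Bethune-Cookman", two_words is "Saint Mary's", exactly_two is "Boston College", too_long is None, and last_paren is "ACC".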

HEX codes

[A-Fa-f0-9]{6} // single
// HEX codes from "Primary: 0050A3 Secondary: FFFFFF"
// (word boundaries keep words such as "Secondary" from matching)
\b[A-Fa-f0-9]{6}\b
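A short sketch of pulling the codes out of the example string in Python. It assumes the pattern \b[A-Fa-f0-9]{6}\b, which wraps the six-hex-digit class in word boundaries so that ordinary words are not picked up.

```python
import re

colors = "Primary: 0050A3 Secondary: FFFFFF"

# Exactly six hex digits, standing alone as a word.
HEX_CODE = re.compile(r"\b[A-Fa-f0-9]{6}\b")

codes = HEX_CODE.findall(colors)
```

For this input, codes is ['0050A3', 'FFFFFF'].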

NCAA

// team names
(\w*Alt\w*){0,1}((-\s)?\w*ALT\w*){0,1} // replace matches with "" to strip Alt/ALT tokens
// breaking
^\w+[-]?\w+ // catch hyphenated compound
^\w+\b[-]{0,1}(\w+)? // break two words
^\w+[']?\w?\s\w+[\']?s? // break words with apostrophe
\w+\s\w+$ // capture last 2 words
\w+$ // capture last word
// optimized
^[\w'&.-]+[ &]?[\w'&.]+ // hyphen placed last so it is a literal, not a range
// workflow
// 1. break school with State
^[\w'&.-]+[ &]?[\w'&.]+ State
^[\w'&.-]+[ &]?[\w'&.]+ (State|University) // State or University
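The first workflow step, splitting the school from the rest of the team name, might look like this in Python. This sketch writes the character class with the hyphen last ([\w'&.-]) so it is read as a literal hyphen, and uses an alternation for State or University. That spelling, the function name split_school, and the sample team names are assumptions for illustration.

```python
import re

# School prefix ending in "State" or "University".
SCHOOL = re.compile(r"^[\w'&.-]+[ &]?[\w'&.]+ (?:State|University)")

def split_school(full_name):
    """Return (school, remainder), or None when no school prefix is found."""
    m = SCHOOL.match(full_name)
    if m is None:
        return None
    return m.group(0), full_name[m.end():].strip()

nc_state = split_school("North Carolina State Wolfpack")
ohio_state = split_school("Ohio State Buckeyes")
boston = split_school("Boston University Terriers")
duke = split_school("Duke Blue Devils")
```

Here nc_state is ("North Carolina State", "Wolfpack"), ohio_state is ("Ohio State", "Buckeyes"), boston is ("Boston University", "Terriers"), and duke is None because the name contains neither keyword.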