“Some people, when confronted with a problem, think ‘I know, I’ll use regular expressions.’ Now they have two problems.” – Jamie Zawinski
Every now and then I get a requirement that involves parsing data, and that usually calls for some form of regular expression (regex): a syntax for searching for patterns in a string or a set of strings. At first the syntax looks intimidating, and most people shy away from it and resort to writing custom functions instead.
Regex is a must-have skill: it applies to a wide range of tasks that need some sort of search or parsing, and it is available in most programming languages as well as in shells, text editors and IDEs. I use it most of the time when writing front-end JavaScript validation and back-end logic in Apex, NodeJS, Java/Groovy, Swift and Python.
I created this cheat sheet that covers the basics and some handy tips.
Flags
The search pattern is normally delimited by two slash characters, as in /abc/. After the closing slash we can specify any combination of the following flags (which ones are available depends on the regex flavor).
- g (global) – don’t stop after the first match
- m (multi-line) – ^ and $ match the start/end of each line
- i (insensitive) – case-insensitive match
- x (extended) – ignore whitespace in the pattern
- X (eXtra) – disallow meaningless escapes
- s (single line) – dot matches newline
- u (unicode) – match with full Unicode
- U (Ungreedy) – make quantifiers lazy
- A (Anchored) – anchor the pattern to the start of the string
- J (Jchanged) – allow duplicate subpattern names
- D (Dollar end only) – $ matches only at the very end of the string
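Not every language writes flags after a closing slash. As a rough illustration, here is a sketch in Python, where the equivalent flags are passed as constants to the re module. The sample text and patterns are made up for the demonstration.

```python
import re

text = "Hello\nhello world"

# i (insensitive): re.IGNORECASE matches regardless of case.
# findall already returns every match, which is what g does elsewhere.
all_hellos = re.findall(r"hello", text, flags=re.IGNORECASE)

# m (multi-line): ^ and $ match at the start/end of every line.
line_starts = re.findall(r"^\w+", text, flags=re.MULTILINE)

# s (single line): the dot also matches a newline.
without_s = re.search(r"Hello.hello", text)
with_s = re.search(r"Hello.hello", text, flags=re.DOTALL)

# x (extended): whitespace and comments inside the pattern are ignored.
date_pattern = re.compile(
    r"""
    (\d{4})  # year
    -
    (\d{2})  # month
    """,
    flags=re.VERBOSE,
)
date_match = date_pattern.search("released 2024-05")
```

Without the s flag the dot refuses to cross the line break, so without_s is None while with_s is a match.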
Anchors
Quantifiers
Or and Brackets
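Inside a bracket expression most special characters lose their meaning, so . * + ? and similar are matched literally. The exceptions are ] ^ and -, and whether \ still escapes depends on the flavor: in POSIX bracket expressions it is a literal character, while in PCRE, JavaScript and Python it still works as an escape.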
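How brackets behave depends on the regex flavor. A small sketch using Python's re module, where . and + are literal inside brackets but a backslash still escapes (the input strings are invented):

```python
import re

# Inside brackets, . and + are literal characters, not metacharacters.
literal_chars = re.findall(r"[.+]", "a.b+c")

# A leading ^ negates the class; a - between two characters forms a range.
non_digits = re.findall(r"[^0-9]", "a1b2")

# In Python's re module a backslash still escapes inside brackets,
# which is how a literal ] or \ can be included in the class.
brackets_and_slashes = re.findall(r"[\]\\]", r"x]y\z")
```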
Character Classes
In order to be taken literally, the characters ^.[$()|*+?{\ must be escaped with a backslash \ as they have special meaning.
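When the literal text comes from user input, escaping by hand is error-prone. A sketch in Python, where re.escape adds the backslashes for you; the price string is an invented example.

```python
import re

price = "$5.00 (USD)"
receipt = "Total: $5.00 (USD)"

# Unescaped, $ . ( ) keep their special meaning, so the pattern
# does not match the literal text.
unescaped_match = re.search("$5.00 (USD)", receipt)

# re.escape puts a backslash in front of the special characters.
escaped_pattern = re.escape(price)
escaped_match = re.search(escaped_pattern, receipt)
```

Here unescaped_match is None because $ is read as an end-of-string anchor, while escaped_match finds the price as written.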
Groupings
Greedy and Lazy Quantifiers
The quantifiers (* + {}) are greedy operators, so they expand the match as far as they can through the provided text. Appending ? to a quantifier (*? +? {}?) makes it lazy, so it matches as little as possible.
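The difference is easiest to see side by side. A sketch in Python, assuming the usual convention that a ? after a quantifier makes it lazy; the HTML snippet is made up.

```python
import re

html = "<b>bold</b> and <i>italic</i>"

# Greedy: .* runs on to the last > it can reach.
greedy = re.findall(r"<.*>", html)

# Lazy: .*? stops at the first > that lets the match succeed.
lazy = re.findall(r"<.*?>", html)
```

The greedy pattern returns the whole string as one match, while the lazy one returns the four tags separately.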
Boundaries
Back References – \1
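A back reference such as \1 matches the same text that an earlier capturing group matched. A small sketch in Python, with a made-up sentence, that finds and collapses doubled words:

```python
import re

sentence = "this is is a test test"

# (\w+) captures a word; \1 must then match the same text again.
repeated = re.findall(r"\b(\w+) \1\b", sentence)

# In the replacement string, \1 refers to the captured word.
deduplicated = re.sub(r"\b(\w+) \1\b", r"\1", sentence)
```

The \b boundaries keep the pattern from matching the "is" at the end of "this".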
Look-ahead and Look-behind
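Look-ahead and look-behind assert that something does (or does not) appear next to the match, without making it part of the match. A hedged sketch in Python; the sample string is invented.

```python
import re

prices = "apple $5, pear 7 EUR, plum $12"

# Look-behind (?<=\$): digits preceded by a dollar sign.
# The sign itself is not part of the match.
dollar_amounts = re.findall(r"(?<=\$)\d+", prices)

# Look-ahead (?= EUR): digits followed by " EUR".
euro_amounts = re.findall(r"\d+(?= EUR)", prices)

# Negative look-behind (?<!\$): digits NOT preceded by a dollar sign.
# The \b stops the match from starting in the middle of a number like 12.
not_dollars = re.findall(r"\b(?<!\$)\d+", prices)
```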
Top Regular Expressions
Summary
As we’ve learned, regex is powerful and its applications are wide. Listed below are a few of the things you can do with regex in your project.
- input and data validation:
  - validate user input in forms
  - validate data before applying logic or saving to a database
  - validate a JSON schema
- replacing values – replace specific data in a string
- text parsing – e.g. retrieve only certain bits of data from a string, a URL or delimited text
- string replacement – e.g. in some IDEs you can find and replace a string, using regex to search for particular patterns
- web scraping – look for specific patterns to find the data to scrape
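A few of these uses can be sketched in Python. The function names and patterns below are hypothetical and deliberately simple; real validation rules are usually stricter.

```python
import re

# Hypothetical rule: a letter followed by 2 to 15 letters, digits or underscores.
USERNAME = re.compile(r"[A-Za-z][A-Za-z0-9_]{2,15}")


def is_valid_username(value: str) -> bool:
    """Validation: the whole input must match the pattern."""
    return USERNAME.fullmatch(value) is not None


def query_params(url: str) -> dict:
    """Parsing: pull key=value pairs out of a URL's query string."""
    return dict(re.findall(r"[?&]([^=&#]+)=([^&#]*)", url))


def mask_digits(text: str) -> str:
    """Replacement: hide every digit that has at least four digits after it."""
    return re.sub(r"\d(?=(?:\D*\d){4})", "*", text)
```

For example, mask_digits("4111 1111 1111 1234") leaves only the last four digits readable, and query_params("https://example.com/search?q=regex&page=2#top") gives a dictionary with the keys q and page.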
Sample Code
Apex class that implements removal of white space not found in quotes
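The same technique carries over to other languages. Below is a sketch of the idea in Python rather than Apex, assuming that only double quotes count and that they are balanced. The function name is made up for illustration.

```python
import re

# Either a complete "quoted" chunk (captured in group 1)
# or a run of whitespace outside quotes.
QUOTED_OR_SPACE = re.compile(r'("[^"]*")|\s+')


def strip_unquoted_whitespace(text: str) -> str:
    """Remove whitespace that is not inside double quotes."""
    # Quoted chunks are put back unchanged; whitespace runs become "".
    return QUOTED_OR_SPACE.sub(lambda m: m.group(1) or "", text)
```

The trick is to match the quoted parts first so the whitespace inside them is never seen by the \s+ branch.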
Python script that crawls pages that match the pattern.
And finally, as a takeaway: just learn the syntax and hack away.