# Regular Expressions Tutorial

For many programmers the *regex* is some sort of magical sword that they throw to solve any kind of text parsing situation. But this tool is nothing magical, and even though it’s great at what it does, it’s not a full featured programming language (*i.e.* it is **not** Turing-complete).

# What does ‘regular expression’ mean?

*Regular expressions* express a language defined by a *regular grammar* that can be solved by a *nondeterministic finite automaton* (NFA), where matching is represented by the states.

A *regular grammar* is the most simple grammar as expressed by the *Chomsky Hierarchy*.

Simply said, a regular language is visually expressed by what an NFA can express, and here’s a very simple example of NFA:

And the *Regular Expression* language is a textual representation of such an automaton. That last example is expressed by the following regex:

```
^[01]*1$
```

Which is matching any string beginning with `0`

or `1`

, repeating 0 or more times, that ends with a `1`

. In other words, it’s a regex to match odd numbers from their binary representation.

# Are all regex actually a *regular* grammar?

Actually they are not. Many regex engines have improved and are using *push-down automata*, that can stack up, and pop down information as it is running. Those automata define what’s called *context-free* grammars in Chomsky’s Hierarchy. The most typical use of those in non-regular *regex*, is the use of a recursive pattern for parenthesis matching.

A recursive regex like the following (that matches parenthesis) is an example of such an implementation:

```
{((?>[^\(\)]+|(?R))*)}
```

(this example does not work with python’s `re`

engine, but with the `regex`

engine, or with the PCRE engine).

# Resources

For more information on the theory behind Regular Expressions, you can refer to the following courses made available by MIT:

- Automata, Computability, and Complexity
- Regular Expressions & Grammars
- Specifying Languages with Regular Expressions and Context-Free Grammars

When you’re writing or debugging a complex regex, there are online tools that can help visualize regexes as automatons, like the debuggex site.