Home > Journal > Parsing Expression Grammars and Left Recursion

Parsing Expression Grammars and Left Recursion

July 25th, 2008 Gordon

Parsing Expression Grammars are a form of context-free grammars with ordered choice. This reduces ambiguity in the grammar and makes writing simple recursive-descent parsers (and parser generators) quite easy. With memoization, you can parse in linear time. The resulting parser is called a Packrat Parser.

The only problem is handling left-recursion. If you have grammar rules like this:

Expression = Expression "+" Term
           | Term

your recursive-descent parser will recurse until it runs out of stack.

I just noticed a paper on detecting and handling left-recursion: Packrat Parsers can Handle Left Recursion. It’s a nifty technique, but it somewhat reduces the simplicity of the resulting program.

For a finger exercise a while ago I wrote a pretty basic packrat parser generator and used it to implement a C preprocessor. Source is here. When my supply of circular tuits increases I’ll have to look at implementing the stuff from the paper.

  1. July 26th, 2008 at 03:46 | #1

    For what it’s worth, the way I proposed handling left-recursion has a much simpler runtime. Instead, more of the work is done upfront with the grammar.

    https://lists.csail.mit.edu/pipermail/peg/2007-September/000157.html
    https://lists.csail.mit.edu/pipermail/peg/2007-September/000160.html

  2. July 26th, 2008 at 08:04 | #2

    Hmm, I’ll have to take a look.

  3. Carolyn
    August 1st, 2008 at 16:20 | #3

    I didn’t understand more than one word in four of that. Good thing I’m not a linguist.

  4. August 1st, 2008 at 16:41 | #4

    Actually, it’s computer science, not linguistics :-) Computers have grammar too!

  5. Carolyn
    August 4th, 2008 at 09:08 | #5

    Ah! No wonder the computer gods don’t like me, when I can’t even understand their language :).

Comments are closed.