SÃ©rgio Medeiros and colleagues have published a paper that fleshes out their previously unpublished algorithm that I used for IronMeta‘s left-recursion handling, along with a very kind mention of IronMeta itself.
Parsing Expression Grammars are a form of context-free grammars with ordered choice. This reduces ambiguity in the grammar and makes writing simple recursive-descent parsers (and parser generators) quite easy. With memoization, you can parse in linear time. The resulting parser is called a Packrat Parser.
The only problem is handling left-recursion. If you have grammar rules like this:
Expression = Expression "+" Term
your recursive-descent parser will recurse until it runs out of stack.
I just noticed a paper on detecting and handling left-recursion: Packrat Parsers can Handle Left Recursion. It’s a nifty technique, but it somewhat reduces the simplicity of the resulting program.
For a finger exercise a while ago I wrote a pretty basic packrat parser generator and used it to implement a C preprocessor. Source is here. When my supply of circular tuits increases I’ll have to look at implementing the stuff from the paper.