An improved fork of LIME, an LALR(1) parser generator written in PHP. The original source code can be found at

Lime: An LALR(1) parser generator in and for PHP.

Interpreter pattern got you down? Time to use a real parser? Welcome to Lime.

If you're familiar with BISON or YACC, you may want to read the metagrammar. It's written in the Lime input language, so you'll get a head-start on understanding how to use Lime.

  1. If you're not running Linux on an IA32 box, then you will have to rebuild lime_scan_tokens for your system. It should be enough to erase it, and then type CFLAGS=-O2 make lime_scan_tokens at the bash prompt.

  2. Stare at the file lime/metagrammar to understand the syntax. You're seeing slightly modified and tweaked Backus-Naur forms. The main difference is that you get to name your components, instead of referring to them by numbers the way that BISON demands. This idea was stolen from the C-based "Lemon" parser from which Lime derives its name. Incidentally, the author of Lemon disclaimed copyright, so you get a copy of the C code that taught me LALR(1) parsing better than any book, despite the obvious difficulties in understanding it. Oh, and one other thing: symbols are terminal if the scanner feeds them to the parser. They are non-terminal if they appear on the left side of a production rule. Lime names semantic categories using strings instead of the numbers that BISON-based parsers use, so you don't have to declare any list of terminal symbols anywhere.

  3. Look at the file lime/lime.php to see what pragmas are defined. To be more specific, look at the method lime::pragma(), which, at the time of this writing, supports "%left", "%right", "%nonassoc", "%start", and "%class". The first three set operator precedence and associativity. The last two declare the start symbol and the name of the PHP class to generate, which will hold all the bottom-up parsing tables.

  4. Write a grammar file.

  5. /path/to/lime/lime.php list-of-grammar-files > my_parser.php

  6. Read the function parse_lime_grammar() in lime.php to understand how to integrate your parser into your program.

  7. Integrate your parser as follows:

     require 'lime/parse_engine.php';
     require 'my_parser.php';
     // Later:
     $parser = new parse_engine(new my_parser());
     // And still later:
     try {
         while (..something..) {
             $parser->eat($type, $val);
             // You figure out how to get the parameters.
         }
         // And after the last token has been eaten:
         $parser->eat_eof();
     } catch (parse_error $e) {
         die($e->getMessage());
     }
     return $parser->semantic;

  8. You now have the computed semantic value of whatever you parsed. Add salt and pepper to taste, and serve.
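
Step 7 leaves it to you to come up with $type and $val for each call to eat(). As a rough sketch of one way to do that, here is a small hand-written scanner for arithmetic expressions. Nothing in it is part of Lime: the function name tokenize() and the token type names ('num' and the literal operator characters) are assumptions made up for illustration, and they would have to match whatever terminal names your own grammar uses.

```php
<?php
// Hypothetical helper, not part of Lime: split an arithmetic string into
// [type, value] pairs. The type names used here ('num' and the literal
// operator characters) are assumptions; use the terminals your grammar names.
function tokenize(string $input): array
{
    $tokens = [];
    $offset = 0;
    $length = strlen($input);
    while ($offset < $length) {
        // \G anchors each pattern at the current offset.
        if (preg_match('/\G\s+/', $input, $m, 0, $offset)) {
            $offset += strlen($m[0]);            // skip whitespace
        } elseif (preg_match('/\G\d+/', $input, $m, 0, $offset)) {
            $tokens[] = ['num', (int) $m[0]];    // integer literal
            $offset += strlen($m[0]);
        } elseif (preg_match('/\G[-+*\/()]/', $input, $m, 0, $offset)) {
            $tokens[] = [$m[0], $m[0]];          // operator or parenthesis
            $offset += 1;
        } else {
            throw new InvalidArgumentException(
                "Unexpected character at offset $offset: " . $input[$offset]
            );
        }
    }
    return $tokens;
}
```

With something like this in place, the while loop in step 7 could become foreach (tokenize($text) as [$type, $val]) { $parser->eat($type, $val); }. For anything bigger than a toy language, a generated scanner (see flex_token_stream.php and lime_scan_tokens.l) is probably a better fit than a hand-rolled one.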