Changes Log

Moony Parser 2.4

- bug fixes

- made cosmetic changes to match Structured BNF language: curly braces replaced by round braces; substraction "\" operator replaced by not-matching operator "!"; keyword "abstract" is now pronounced "tag"

- implemented matching operator "&"

Moony Parser 2.3

- bug fixes

Moony Parser 2.2

- implemented “@AnyCharacter” variable

- implemented substraction operator “\”

- building grammar with javascript functions is marked obsolete. Preferred method is passing grammar string to compile function

- changed returned parsed tree internal structure for memory and performance issues

- fixed recursive function call stack overflow error for big texts. Only memory is a limit now

- cool grammar tester environment included in download bundle

Moony Parser 2.1

- bug fixes

Moony Parser 2.0

- implemented Earley parsing algorithm

Moony Parser 1.0

- shift-reduce algorithm implemented

1. Introduction

Parsing texts is something that every programmer encounters sooner or later. Word “parsing” in programming world denotes extraction of “abstract syntax tree” from textual input (string or file). For textual input that has some internal structure, we can use parsers to arrange fragments of that textual input over resulting syntax tree. Once built from input text, syntax tree is giving us an easy access to each syntax fragment that builds input text.

For example, we can have a date stored inside string. If we want to extract day, month and year separately from string like this: “20/10/2013”, we can make use of following grammar:

Date (

Day {@Number},


Month {@Number},


Year {@Number}


After parsing with this grammar, abstract syntax tree for string “20/10/2013” would have this structure:




     |   |    |    |   |

    Day “/” Month “/” Year

     |        |        |

   “20”     “10”     “2013”

Although usual date parsing would be done with three simple substring functions, we used the date example for a sake of explanation simplicity. In the same way that date is parsed, we can parse any kind of text, assuming that we provide correct grammar. These grammars can include boolean comparison definition, math expression definition or even  definitions for entire programming languages.

Moony Parser is a library tool that can parse texts and build abstract syntax trees by any provided grammar.

2. About Moony Parser

Moony Parser is a type of “Earley” parser written in javascript. What is interesting with Moony parser is that it uses its own minimalistic language for defining grammars, named "Structured BNF". Structured BNF is context free grammar complete, although it differs from context free grammar language. The main difference is structured representation of grammars, which makes it more readable than i.e. regular BNF’s or PEG’s grammars. Otherwise, Structured BNF language could be considered as a variant of context free grammar language.

3. Moony Grammar

Grammars in Moony parser are built from blocks that denote sequences (represented as tuples), choices (options determined at parsing time), variables, constants and regular expressions.

Tuples are building blocks that take any number of attributes which all have to exist in a sequence at parsing time. Tuples are defined inside round braces, preceding with optional tuple name. Attributes of tuple are delimited by a comma. A simple tuple named “Person” could have following structure:

Person (

        Name (@Word),


        Surname (@Word)


This grammar succeeds at parsing strings like “John Doe”, but not “John Doe Smith”.

Choices represent building blocks of which any, but only one element should be picked at parsing time. Choices are also defined within round braces, preceding with optional choice name. Elements of a choice are delimited by sign “|”. As an example of choice structure we can define a choice named “Vehicle”:

Vehicle ("car" | "ship" | "plane")

In parsing time this grammar will succeed weather parsed string says “car”, “ship” or “plane”.

Another building blocks are variables. Variables are denoted with preceding “@” sign. They can represent built in variables like:












In our first example we used built in variables @Word and @Whitespace. A variable also can refer to existing tuple or choice, like in following example:

Person (

        Name (@Word),

        Next ((@WhiteSpace, @Person) | @Null)


In this example, besides built in variables we have also used a variable @Person that refers recursively to a tuple Person. Grammar in previous example can parse a name followed by any number of middle names and surnames. Variables can also make use of dots, giving us a way to denote a path to the final variable, in which case a variable “@Person.Name” would refer to its respective content.

Constants in Moony grammar are denoted inside single or double quotes. If a quote is a part of a constant, we can escape it by preceding “\” sign (like "a quote inside const: \""). Example:

VariableDeclaration (

        Type (

                "string" |

                'int' |




        Name (@Variable),

        EOL (';')


This grammar can parse strings like “string name;” or “int num1;”.

Also, grammars can include regular expressions that are surrounded with “/” signs. Regular expressions can be followed by modifyers “i”, “m”, or both to denote “ignore case” and/or “multiline” respectively. An example would be:

AnyCaseWords (

Word (/[a-z]+/i),

Next ((@WhiteSpace, @AnyCaseWords) | "!")


This grammar can parse strings like “U can PARSE mE!”.

4. Keyword “tag”

Moony grammar also  accepts a keyword “tag” that precedes tuple or choice notation. The keyword  “tag” denotes that its respective tuple or choice would not be parsed at the spot where they reside. Yet, its tuple or choice can be parsed only at places where they are indirectly referenced through “@” variable. Example:

TagTest (

        tag Str {"parse me there, not here"} |

        Test1 ("now ", @TagTest.Str) |

        Test2 ("again ", @TagTest.Str)


Previous example parses strings “now parse me there, not here”, “again parse me there, not here”, but not “parse me there, not here”. Tags are ought to be used when multiple references to the same structure would be required, to prevent repeating code. You can also make a tag of an attribute of a tuple, in which case that attribute is ignored at given position.

5. Left recursion

Moony Parser, being of “Earley” type, has no problems with parsing left recursive grammars. Left recursion can be genious used to produce interesting results with operator precedence:

Sum (

         Fact (

            Exp (

               (@WhiteSpace | @Null),

               ('-' | @Null),

               Value (

                          Num (@Number) |

                          Var (@Variable) |

                          Bra (Left ('('), In (@Sum), Right (')'))


               (@WhiteSpace | @Null)

            ) |

MulDiv (Left (@Fact), In ('*' | '/'), Right (@Exp))

         } |

         AddSub (Left (@Sum), In ('+' | '-'), Right (@Fact))


Grammar in previous example happily respects all rules of operator precedence in math and can correctly parse strings like “1 + 2 * a - (b + c) / (d - e)”.

6. Additional matchings

Additional matchings are used when we want to avoid parsing of specific structures or we want to additionally match parsed structure with another expression. For example if we want to build a quoted string parser, we want a quote to be excluded it from string value while interpreting it as a string terminator. Not-match excludes expressions from parsing and is noted after "!" operator like in following example:

Str (


Value (

                Char (@AnyCharacter) ! '"',

        Next (@Value | @Null)




Not-match can take any valid grammar expression as an argument (a constant in our example). And-match is rarely used and it is denoted by "&" operator. With and-match an expression is parsed only if it also matches and-match argument.

Not-match and and-match can be bixed in any amount after any parsing structure.

7. Programming interface

Grammar is defined in string format and it should be firstly compiled before parsing any text. Once when compile succeeds, we can pass “compiled” property of a result to parsing function that actually parses text. After parsing a text against compiled grammar rules, a syntax tree is returned.

Regular call to compiler and parser is done with “parser.compileSync” and “parser.parseSync” functions:

// this is synchronous call to compiler and parser

var rules = parser.compileSync (grammarString);

if (rules.errors.length === 0) {

var tree = parser.parseSync (rules.compiled, stringToParse);

if (tree.errors.length === 0) {

// to do something with tree.parsed

} else {

alert ("Text error count: " + tree.errors.length);


} else {

        alert ("Grammar error count: " + rules.errors.length);


Is some cases, when grammars are complicated or string to parse is very large, parsing could take a few or more seconds to complete. In those cases we might want to use a parsing call that returns control to a program right away after invoking parser, but triggers completing function after whole text is parsed. This kind of behavior is called asynchronous call and it can be done by functions “parser.compileSync” and “parser.parseAsync”:

// this is asynchronous call to parser

parser.compileAsync (grammarString,

function (rules) {

if (rules.errors.length > 0) {

alert ("Grammar error count: " + tree.errors.length);

} else {

parser.parseAsync (rules.compiled, textToParse,

function (tree) {

if (tree.errors.length > 0) {

alert ("Text error count: " + tree.errors.length);

} else {

                        // to do something with tree.parsed



function (progress) {

progresPercent = Math.round(progress * 100) + "%";

return true;





function (progress) {

progressPercent = Math.round (progress * 100) + "%";

return true;



As we can see, “parser.compileAsync” function has following parameters:

- grammar: string where a grammar is stored in textual format,

- done function: this function is called when parsing grammar is done. It takes compiled grammar or errors as parameter

- progress function:  this function is repeatedly called while parsing executes. If it returns false, parsing breaks

“parser.parseAsync” function has following parameters:

- rules: top rule

- text: string to parse by rules

- done function: this function is called when parsing text is done. It takes syntax tree or errors as a parameter

- progress function:  this function is repeatedly called while parsing executes. If it returns false, parsing breaks

8. Using parsed tree

Parsed tree returned from “parseSync” or passed by “parse” function has following structure:

- if a node is a token:

- “value” property holds a string as a fragment of parsed text

- if a node is a choice:

- “value” property holds a parsed sub-tree

- if a node is a tuple:

- “value” property contains an array of tuple attributes’ sub-trees

- in all cases

- “parsedBy” property holds a rule by which a node is parsed, where:

- “name” property holds a name of the rule

- “type” property holds one of strings: “token”, “tuple” or “choice”

- properties “from” and “to” hold integer positions of parsed node inside parsed string

A code that alerts whole syntax tree structure after parsing would look like following:

function pTree (tree) {

var tmp;

if (tree.parsedBy.type === "choice") {

alert (

pTree (tree.value);

} else if (tree.parsedBy.type === "tuple") {

alert (

for (var i = 0; i < tree.value.length; i++) {

        pTree (tree.value[i]);


} else if (tree.parsedBy.type === "token") {

alert (tree.value);



pTree (tree.parsed);

9. Errors

If there were any errors during parsing grammar or text, result’s property “errors” holds an array of objects structured this way:

tree.errors[x] = {  




description: ...string...


For parsing grammars, multiple errors can be reported (like “Element not found at...” with reference variables or “Root node can not be abstract”). For parsing texts only one error can be reported at the same time (that is “Syntax error at…”).

10. Conclusion

Moony Parser is an “Earley” parser which makes it context free grammar complete. It has no problems with nullable grammars, at least none I’ve noticed. I guess it will find uses in parsing texts like database search engine queries or parsing textual file formats through AJAX calls. Basically, for parsing large files only amount of memory is the limit. 2 gigs of ram enables parsing of few thousands lines of text and bigger files tend to use virtual memory which degrades performance.

The parser is relatively fast (half line per millisecond on my machine) and the code is free for any use, modified or non-modified by third party. To use the code, download bundled zip from this site and include “parser.js” file in your HTML. There is a HTML grammar testing environment included in download bundle. If you'd like to support the free world, please consider donating.

Good luck!