ART — Basic introduction

James Tapsell
Published in Jrtapsell
May 6, 2017

An introduction to the ART parser generator

For my summer UROP I created an Eclipse plugin that worked with the ART parser generator. This post is a basic introduction to ART.

Unless you are using ART, this will probably not be very helpful, but some of the basics should apply to similar systems.

Syntax

ART grammars have quite a complex syntax (which is itself defined in ART), but at the most basic level they are similar to BNF.

Grammars look something like this:

a ::= b c
b ::= 'hello'
c ::= 'world'

This grammar parses the string

hello world

Terminals

There are 3 types of character terminals in ART.

  1. Single quoted strings
    These were used in the previous example. They are case sensitive, so the terminal 'hello' matches the string hello but not the string Hello.
  2. Double quoted strings
    These are written with double quotes around them and are not case sensitive.
  3. Backtick characters
    These are used for defining terminals that would be difficult to define otherwise. The backtick applies to the next character only.
    It also disables whitespace management, so while
    a ::= 'b' 'c' will accept b c , a ::= `b `c will not.
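The three terminal types can be modelled with a rough Python sketch. This is not how ART is implemented; the function names are hypothetical, and it assumes that whitespace management means skipping any whitespace after a quoted terminal.

```python
def skip_whitespace(text, pos):
    """Advance past any whitespace starting at pos."""
    while pos < len(text) and text[pos].isspace():
        pos += 1
    return pos

def single_quoted(literal):
    """Case-sensitive terminal; trailing whitespace is skipped (assumption)."""
    def match(text, pos):
        if text.startswith(literal, pos):
            return skip_whitespace(text, pos + len(literal))
        return None
    return match

def double_quoted(literal):
    """Case-insensitive terminal; trailing whitespace is skipped (assumption)."""
    def match(text, pos):
        end = pos + len(literal)
        if text[pos:end].lower() == literal.lower():
            return skip_whitespace(text, end)
        return None
    return match

def backtick(char):
    """Single-character terminal with no whitespace skipping."""
    def match(text, pos):
        if text.startswith(char, pos):
            return pos + 1
        return None
    return match

def sequence(*matchers):
    """Match each terminal in turn, like the right-hand side of a rule."""
    def match(text, pos):
        for matcher in matchers:
            pos = matcher(text, pos)
            if pos is None:
                return None
        return pos
    return match

def accepts(matcher, text):
    """True if the matcher consumes the whole input."""
    return matcher(text, 0) == len(text)
```

With this model, `accepts(sequence(single_quoted('b'), single_quoted('c')), 'b c')` is true, while the same input is rejected by `sequence(backtick('b'), backtick('c'))` because nothing consumes the space.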

Built-ins

Some elements of grammars are common; for example, integer representations normally match the regex [0-9]+ .

Spelling these out in every grammar would be repetitive and less expressive, so ART provides some built-ins, all of which begin with an ampersand. The most commonly used are:

  • &ID
    Any valid Java identifier
  • &INTEGER
    Any integer
  • &STRING_DQ
    Any string enclosed in double quotes
  • &STRING_SQ
    Any string enclosed in single quotes
  • &STRING_BRACE_NEST
    Any string enclosed in braces

If you wanted a grammar that accepted Python-style assignments of integers, you could define it like this:

statement ::= &ID '=' &INTEGER

which would accept

a = 8
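To give a feel for what the built-ins stand for, here is a hedged Python approximation of the statement rule using regular expressions. The pattern for &INTEGER comes from the regex mentioned above; the pattern for &ID is an ASCII-only simplification of Java identifiers, and the name parse_statement is hypothetical. ART's own definitions may differ.

```python
import re

# ASCII-only approximation of a Java identifier (assumption, not ART's definition).
ID = r"[A-Za-z_$][A-Za-z0-9_$]*"
# The integer regex quoted in the text.
INTEGER = r"[0-9]+"

# statement ::= &ID '=' &INTEGER, with optional whitespace between tokens.
STATEMENT = re.compile(rf"\s*({ID})\s*=\s*({INTEGER})\s*")

def parse_statement(text):
    """Return (name, value) if text is an integer assignment, else None."""
    match = STATEMENT.fullmatch(text)
    if match is None:
        return None
    return match.group(1), int(match.group(2))
```

Calling `parse_statement("a = 8")` returns the pair `("a", 8)`, while an input such as `"8 = a"` returns `None`.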

Alternation and Repetition

Sometimes in a grammar you may want to allow multiple options. An example of this is a simple array of booleans in JSON, where each element can be true or false . In ART this can be implemented as:

boolean ::= 'true' | 'false'

Repetition can be achieved recursively, by saying that a rule matches either a single item, or itself followed by another item:

booleans ::= boolean | booleans boolean
boolean ::= 'true' | 'false'

This would accept

true false true false
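The recursive rule describes "one or more booleans". As an illustration only, a hand-written Python recogniser for the same language might look like the sketch below. A function that called itself before consuming any input would recurse forever, so the sketch uses a loop instead of mirroring the left-recursive rule directly. The function names are hypothetical, and the input is assumed to be split on whitespace.

```python
def parse_boolean(tokens, pos):
    """boolean ::= 'true' | 'false'. Return the next position, or None."""
    if pos < len(tokens) and tokens[pos] in ("true", "false"):
        return pos + 1
    return None

def parse_booleans(tokens, pos=0):
    """booleans ::= boolean | booleans boolean, written as a loop."""
    pos = parse_boolean(tokens, pos)
    if pos is None:
        return None  # at least one boolean is required
    while True:
        next_pos = parse_boolean(tokens, pos)
        if next_pos is None:
            return pos
        pos = next_pos

def accepts_booleans(text):
    """True if the whole input is one or more booleans."""
    tokens = text.split()
    return parse_booleans(tokens) == len(tokens)
```

Here `accepts_booleans("true false true false")` is true, and both the empty string and `"true maybe"` are rejected.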

Delimiters

In arrays you may want a delimiter between items. This can be added like so:

booleans ::= boolean | booleans ',' boolean
boolean ::= 'true' | 'false'
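The delimited rule accepts a boolean followed by zero or more ", boolean" pairs, so a trailing comma or a missing comma is rejected. A small Python sketch of that check follows; the tokenizer and function names are hypothetical, and whitespace is assumed to be allowed around every token.

```python
import re

# Tokens used by the rule: the two boolean literals and the comma delimiter.
TOKEN = re.compile(r"\s*(true|false|,)")

def tokenize(text):
    """Split text into tokens, or return None on an unexpected character."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            return None
        tokens.append(match.group(1))
        pos = match.end()
    return tokens

def accepts_delimited(text):
    """True if text is boolean (',' boolean)* as described by the rule."""
    tokens = tokenize(text)
    if not tokens or len(tokens) % 2 == 0:
        return False  # empty input, unknown token, or dangling delimiter
    # Booleans sit at even indices and commas at odd indices.
    return all((tok == ",") == (i % 2 == 1) for i, tok in enumerate(tokens))
```

For example, `accepts_delimited("true, false, true")` is true, while `"true false"` and `"true,"` are both rejected.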

Optional items

To make an item optional you can alternate it with epsilon, written as # :

optional_value ::= value | #
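Matching epsilon means succeeding without consuming any input. The Python sketch below shows that idea under a stated assumption: the rule value is not defined in this post, so the sketch stands in a run of digits for it, and both function names are hypothetical.

```python
import re

DIGITS = re.compile(r"[0-9]+")

def match_value(text, pos):
    """Stand-in for the undefined 'value' rule: a run of digits (assumption)."""
    match = DIGITS.match(text, pos)
    return match.end() if match else None

def match_optional_value(text, pos):
    """optional_value ::= value | #. Falls back to consuming nothing."""
    end = match_value(text, pos)
    return pos if end is None else end
```

When a value is present the position advances past it; otherwise the position is returned unchanged, which is the epsilon alternative.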

Comments

There are 2 forms of comments in ART:

  • Double slash
    // Single line comments
  • Bracket star
    (* Continues until closed *)
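As a rough illustration of the two comment forms, the following Python sketch strips them from grammar text with a regular expression. It assumes that bracket star comments do not nest and it does not look inside quoted terminals, so a // within a string would be stripped by mistake. The name strip_comments is hypothetical.

```python
import re

# '//' up to the end of the line, or '(*' up to the first '*)'.
COMMENT = re.compile(r"//[^\n]*|\(\*.*?\*\)", re.DOTALL)

def strip_comments(grammar):
    """Remove both comment forms from grammar text."""
    return COMMENT.sub("", grammar)
```

The newline that ends a double slash comment is kept, so line numbers in the stripped text still line up with the original for single line comments.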
