A Meta-Grammar for PEG Parsers

Guido van Rossum
Sep 9 · 9 min read
start: rules ENDMARKER
rules: rule rules | rule
rule: NAME ":" alts NEWLINE
alts: alt "|" alts | alt
alt: items
items: item items | item
item: NAME | STRING
item: NAME { name.string } | STRING { string.string }
items: item items { [item] + items } | item { [item] }
alt: items { Alt(items) }
@subheader "from grammar import Rule, Alt"
@subheader """
from token import OP
from grammar import Rule, Alt
"""
start: metas rules ENDMARKER | rules ENDMARKER
metas: meta metas | meta
meta: "@" NAME STRING NEWLINE
start: metas rules ENDMARKER { Grammar(rules, metas) }
| rules ENDMARKER { Grammar(rules, []) }
metas: meta metas { [meta] + metas }
| meta { [meta] }
meta: "@" NAME STRING { (name.string, eval(string.string)) }
alt: items action { Alt(items, action) }
| items { Alt(items, None) }
action: "{" stuffs "}" { stuffs }
stuffs: stuff stuffs { stuff + " " + stuffs }
| stuff { stuff }
stuff: "{" stuffs "}" { "{" + stuffs + "}" }
| NAME { name.string }
| NUMBER { number.string }
| STRING { string.string }
| OP { None if op.string in ("{", "}") else op.string }
@subheader """
from grammar import Grammar, Rule, Alt
from token import OP
"""
start: metas rules ENDMARKER { Grammar(rules, metas) }
| rules ENDMARKER { Grammar(rules, []) }
metas: meta metas { [meta] + metas }
| meta { [meta] }
meta: "@" NAME STRING NEWLINE { (name.string, eval(string.string)) }
rules: rule rules { [rule] + rules }
| rule { [rule] }
rule: NAME ":" alts NEWLINE { Rule(name.string, alts) }
alts: alt "|" alts { [alt] + alts }
| alt { [alt] }
alt: items action { Alt(items, action) }
| items { Alt(items, None) }
items: item items { [item] + items }
| item { [item] }
item: NAME { name.string }
| STRING { string.string }
action: "{" stuffs "}" { stuffs }stuffs: stuff stuffs { stuff + " " + stuffs }
| stuff { stuff }
stuff: "{" stuffs "}" { "{" + stuffs + "}" }
| NAME { name.string }
| NUMBER { number.string }
| STRING { string.string }
| OP { None if op.string in ("{", "}") else op.string }
def peek_token(self):
if self.pos == len(self.tokens):
while True:
token = next(self.tokengen)
if token.type in (NL, COMMENT):
continue

break
self.tokens.append(token)
self.report()
return self.tokens[self.pos]
start: metas rules ENDMARKER { Grammar(rules, metas) }
| rules ENDMARKER { Grammar(rules, []) }
$ python -m tokenize
foo bar
baz
dah
dum
^D
NAME     'foo'
NAME 'bar'
NEWLINE
INDENT
NAME 'baz'
NEWLINE
NAME 'dah'
NEWLINE
DEDENT
NAME 'dum'
NEWLINE
rule: NAME ":" alts NEWLINE INDENT more_alts DEDENT {
Rule(name.string, alts + more_alts) }
| NAME ":" alts NEWLINE { Rule(name.string, alts) }
| NAME ":" NEWLINE INDENT more_alts DEDENT {
Rule(name.string, more_alts) }
more_alts: "|" alts NEWLINE more_alts { alts + more_alts }
| "|" alts NEWLINE { alts }
start:
| metas rules ENDMARKER { Grammar(rules, metas) }
| rules ENDMARKER { Grammar(rules, []) }

Guido van Rossum

Written by

Welcome to a place where words matter. On Medium, smart voices and original ideas take center stage - with no ads in sight. Watch
Follow all the topics you care about, and we’ll deliver the best stories for you to your homepage and inbox. Explore
Get unlimited access to the best stories on Medium — and support writers while you’re at it. Just $5/month. Upgrade