Learning the Language I Created

Mar 1, 2017

I created the programming language Flabbergast. I also had to learn how to program in it. Having to learn how to use something I created is a peculiar feeling.

My first Flabbergast program wasn’t very good. Unfortunately, it’s also the Flabbergast compiler. Oops. I got better, just like when I used programming languages I didn’t invent.

History

When I worked at Google, I had to use a proprietary language called GCL. For NDA reasons, I will only repeat what is already a matter of public record in the partially-redacted GCL Viewer paper.

GCL was heavily used in my role and nearly everyone hated it. GCL was truly Frankenstein’s monster. It had some parts that seemed vaguely JavaScript, some very functional LISP-y bits, Java-ish object behaviour, peculiarly C-like syntax, and some ideas that really came out of the blue. In particular, the “contextual lookup” in Flabbergast came from GCL which allegedly came from natural language processing folks. The problem, familiar to any C++ programmer, is that, when different paradigms come together, confusing things happen at the boundaries. For instance, GCL was mostly the kind of declarative, side-effect-free, functional utopia that gets Haskell programmers off, but it had error handling that could cause non-deterministic behaviour.

JavaScript and C++ programmers have adapted quite well to their respective suck: they have patterns and conventions and rules of thumb to avoid and eliminate certain categories of errors and unintended effects. GCL lacked this for social reasons. The GCL documentation was not very helpful and bad code was written. When new programmers needed to do something, their best resource was the corpus of bad code. Everyone tried to write as little GCL as possible, so the half-way decent GCL was a smaller proportion of the code. This perpetuated a cycle where most of the examples of GCL were bad and hard to understand.

A few people, myself included, had the attitude of Douglas Crockford, author of JavaScript: The Good Parts: yes, GCL was a tire fire on the whole, but there were good ideas buried in there. The things that made GCL good were also really weird. They were features I’ve not found in any other language. I worked to create a manual of design patterns for GCL and other training materials. It was really rewarding when many GCL-haters started to see the value in GCL after they embraced the weird. I began to see GCL with a kind of pity.

That’s not a defence that GCL is good. It’s objectively bad…just not irredeemably bad. What made it powerful, expressive, and useful seemed to be weird–really weird. It’s easy to forgive the authors of GCL since they could not have anticipated the results of their decisions.

For instance, contextual lookup seemed to be part of the good, but it is a form of dynamic scoping. Pick up a programming languages textbook and it might have some little anecdote about how dynamic scoping was one of those laughably bad ideas that has been dead since the 80s and only exists in Emacs Lisp and R for historical reasons. Somehow, contextual lookup really worked in GCL. It was not a burden or a mistake. It was a useful feature, despite the conventional wisdom.

With Flabbergast, I was going to embrace the weird. I wanted to know if my separation of good and bad was “correct” (i.e., useful). The idea was to jettison all the bad features from GCL and start with just the good ones. From there, build outward until the language seemed to hang together. I was totally on-board with taking other ideas I like from places people hate. JavaScript, Visual Basic, XQuery, SQL, and Haskell all provided inspiration, and at least one of those will make a programmer’s stomach turn.

I wasn’t sure what I wanted to make. Part of me wanted to make a viable language. It’s also kind of an art project–a statue made from hubcaps and junk from the side of the road.

I started to design. One of the first things I killed was all the functional programming concepts, despite how much I like functional programming…by removing functions and lambdas entirely. People often use functional and declarative interchangeably. Flabbergast is declarative but definitely not functional. Killing functions necessitated removing the LISP-like map/filter/reduce operations from the language. Most GCL code was heavily dependent on them, but they require functions. I needed an alternative.

The solution was a transfusion of XQuery and SQL. Those languages transform XML documents or tables, respectively. SQL syntax is a clunky product of the 70s, which is why most people hate it. Conceptually, SQL is really amazing: describe the data you want from the data you have and it figures out how to do it. C# has a built-in query system called LINQ, which is definitely the brainchild of someone who liked SQL. Most C# programmers wax poetic about LINQ and demonise SQL in the same sentence. XQuery features a pleasing FLWOR syntax, almost certainly the source of LINQ’s syntax. Flabbergast’s fricassé expression is basically XQuery’s FLWOR with a few different specialisations.
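To make the "describe the data you want" idea concrete, here is a rough analogy in Python rather than Flabbergast syntax. The `servers` data and its field names are made up for illustration. The shape is the FLWOR one: iterate, filter, order, return.

```python
# Hypothetical input data; the field names are invented for this sketch.
servers = [
    {"name": "web1", "role": "web", "cpus": 4},
    {"name": "db1", "role": "database", "cpus": 16},
    {"name": "web2", "role": "web", "cpus": 2},
]

# for each server ... where it is a web server
web_servers = (s for s in servers if s["role"] == "web")
# order by CPU count
by_size = sorted(web_servers, key=lambda s: s["cpus"])
# return its name
web_names = [s["name"] for s in by_size]

print(web_names)  # ['web2', 'web1']
```

No function is defined by the person asking the question apart from the sort key; the query only states which items are wanted and in what order.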

The Charred Dragon Book

Over a long weekend, I wrote the original Flabbergast interpreter in Vala. It was slow, buggy, and awful, but enough to experiment. I started to play. A lot of the original syntax I designed was ugly and I started to revise it. As I built out the language, I made it a point to view the conventional features of other languages with scepticism. Embrace the weird.

After a few tests, I decided that it was time to write a new compiler. The Vala-based interpreter was very slow (partially due to how Vala manages memory and partially due to my design). The goal was to target the JVM and CLR. Those VMs would be able to do more for optimisation than I could alone.

Normally, a self-respecting compiler is written in the language it compiles (e.g., a C compiler is written in C). It seemed the right place to go. It also promised to be the largest program written in either GCL or Flabbergast. The most difficult question was how to write a compiler in such a strange language. It didn’t have the string manipulation facilities to parse anything.

Then the weird came in. Flabbergast is basically good at rendering things. Not graphics-style rendering, but composing different bits of information to generate some expanded form. Think of HTML templating, XSLT, bytecode generation, schema generation, or macro expansion as “rendering” a source file to output. GCL was basically used to take a complicated logical configuration and boil it down to protocol buffers (Google’s JSON equivalent). The Flabbergast “compiler” comes in two parts: an explanation of how the language works built out of small operations, and a description of how you would implement those small operations in Java or C#. When you build the compiler, it renders the language explanation into a big string: a Java or C# program that knows how to compile Flabbergast, built by composing all those small operations. This is not the conventional way to build a compiler. It works though. In fact, I have added new syntax to Flabbergast without writing any C# or Java.

Functions: The Best Battle to Lose

Not having functions started out as a struggle. There were a few times I was tempted to just add functions, but I found other ways to do what I wanted and beautiful things happened.

GCL had lambdas and templates. Nearly all the things a lambda can do, a template can do…and templates can do more.

A template is a kind of proto-object, where an object in Flabbergast is called a frame and is an immutable dictionary. I decided on the name frame because object comes with too much mental baggage from object-oriented languages. In most OO languages, objects serve the purpose of encapsulating data and mixing code (methods) and data (fields). That’s not true in Flabbergast. Flabbergast doesn’t make that distinction: all attributes in a frame are values but they are computed from an expression. To an object-oriented programmer, that means a frame’s attributes can only be read as if they were fields but each is written like it’s a method taking no parameters.
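A small Python model may help object-oriented readers picture this. It is only a sketch: the `Frame` class and the `rectangle` example are hypothetical, and real frames are not implemented this way. Each attribute is written as an expression (here, a function of the frame) but can only be read, never assigned.

```python
class Frame:
    """Hypothetical model of a frame: read-only attributes, each defined by an expression."""

    def __init__(self, **expressions):
        # Bypass our own __setattr__ guard while setting up internal state.
        object.__setattr__(self, "_expressions", expressions)
        object.__setattr__(self, "_cache", {})

    def __getattr__(self, name):
        # Called only when normal lookup fails, i.e. for frame attributes.
        expressions = object.__getattribute__(self, "_expressions")
        cache = object.__getattribute__(self, "_cache")
        if name not in expressions:
            raise AttributeError(name)
        if name not in cache:
            cache[name] = expressions[name](self)  # evaluate at most once
        return cache[name]

    def __setattr__(self, name, value):
        raise TypeError("frames are immutable")


rectangle = Frame(
    width=lambda f: 3,
    height=lambda f: 4,
    area=lambda f: f.width * f.height,  # written like a method, read like a field
)

print(rectangle.area)  # 12
```

Trying `rectangle.width = 10` raises `TypeError`, which is the immutability half of the description.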

The dynamic scope causes a template to inherit the context of the place where it is used rather than the context where it was declared (unlike virtually every other programming language where variables are scoped by where they are declared).
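Here is a deliberately simplified Python sketch of that behaviour. The names `instantiate`, `lookup`, `greeting`, `office`, and `home` are all invented for illustration, and Flabbergast's real lookup rules are richer than a single chain of dictionaries. The point is only that the same template resolves `audience` differently depending on where it is used.

```python
from collections import ChainMap


def instantiate(template, context):
    """Place a template's attributes in front of the context where it is being used."""
    return ChainMap(template, context)


def lookup(scope, name):
    """Find the nearest definition of `name` and evaluate it against the whole scope."""
    return scope[name](scope)


# The template mentions `audience` but does not define it.
greeting = {"message": lambda scope: "Hello, " + lookup(scope, "audience")}

# Two different places where the template might be used.
office = {"audience": lambda scope: "colleagues"}
home = {"audience": lambda scope: "family"}

print(lookup(instantiate(greeting, office), "message"))  # Hello, colleagues
print(lookup(instantiate(greeting, home), "message"))    # Hello, family
```

In a lexically scoped language, `audience` would have to be visible where `greeting` was written; here it only has to be visible where `greeting` is instantiated.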

This can legitimately be a problem when two similar frames collaborate because sometimes it’s necessary to access a name from the declaring context that is also used in the consuming context. The solution was to introduce something that looked like a function but was built from templates and lookup. It’s special syntax that instantiates a template but shields it from the consuming context.

I realised this was a lot like the debate between Java’s anonymous inner classes and C#’s delegates. Originally, Java steadfastly did not have anything that looked like a first-class function. When Microsoft began extending Java into what would become C#, they added delegates which were closures that could be called as functions. The Java developers disapproved. Java had a feature called anonymous inner classes (AIC) that had some of the properties of delegates (they were closures) but they were also regular objects rather than a new category of things. AICs really can do anything a delegate can do and much more. The big problem with Java’s AICs was that the syntax to use them was very verbose. The template versus lambda argument is much the same: templates can do everything that lambdas can do, just with more awkward syntax. So, with appropriate syntactic sugar, we can build lambdas out of templates and get simplicity in both design and use. Java waited until Java 8 to add this sugar. I gave in much quicker.

With the syntax of a function call grafted on top of templates, handling a return value was easy: a function-like template had to have a value attribute containing the return value. However, templates have only named arguments (overrides in Flabbergast speak). Most languages have positional arguments when calling a function, so how would the positional arguments get translated to named overrides?

I toyed with ways of unpacking positional arguments into specific attributes via some translation information. This was very awkward and required the programmer to do a lot of tedious work. Some languages do have named arguments, so the easiest thing to do was require all arguments be named. Is the answer to just ditch positional arguments? Almost.

Other languages have variadic functions: a function that can take an arbitrary number of arguments and they can be accessed as a list. Again, Flabbergast embraces the weird for an elegant solution: unnamed arguments are treated as variadic and stuffed into a list named args. That is, positional arguments are always variadic. Any other arguments must be given names in Flabbergast and named arguments can appear in any order. Remembering the correct argument order is simply a thing Flabbergast programmers never have to do. It’s weird, but it’s kind of nice and certainly less awkward than the situation in Python or R, where named and positional arguments can be mixed, which yields confusing questions like: can I skip a positional argument if I name it later? In Flabbergast, these two kinds of arguments are non-overlapping.

It did have one immediate problem: variadic functions couldn’t call each other. In Java and C#, variadic arguments are passed as an array. This means there are two ways to call a variadic function: with a bunch of items that get automatically packed into an array or with a single array. If your function takes an array of arrays, things could get confusing, but the static type system can sort out those situations. In Python, the dynamic type system can’t provide any help, so Python gets around this by having special syntax for passing a list as the variadic arguments. PERL does the mirror image, with special syntax for passing a list as a non-variadic argument, because PERL.

Flabbergast actually has the solution built in: just pass variadic arguments as a named argument called args since that’s really what we’re doing with variadic arguments anyway. The only extra rule the compiler needs is that it’s illegal to have variadic arguments and a named args argument at the same time.
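The calling convention can be modelled in a few lines of Python. This is a sketch under assumptions, not the compiler's implementation: the `call` helper, the `defaults`/`value` layout, and both example templates are hypothetical. What it does follow from the description is that positional arguments become a list named `args`, everything else is a named override, the result is the `value` attribute, and mixing positional arguments with a named `args` is an error.

```python
def call(template, *positional, **named):
    """Hypothetical function-call sugar: override attributes, then return `value`."""
    if positional and "args" in named:
        # The one extra rule: the two ways of supplying `args` cannot be combined.
        raise TypeError("cannot combine positional arguments with a named 'args'")
    attributes = {**template["defaults"], **named}
    if positional:
        attributes["args"] = list(positional)
    return template["value"](attributes)


join_template = {
    "defaults": {"separator": ", ", "args": []},
    "value": lambda attrs: attrs["separator"].join(attrs["args"]),
}

# A variadic template that forwards its own `args` to another variadic template.
shout_template = {
    "defaults": {"args": []},
    "value": lambda attrs: call(
        join_template, args=[word.upper() for word in attrs["args"]], separator=" "
    ),
}

print(call(join_template, "a", "b", "c"))            # a, b, c
print(call(join_template, "a", "b", separator="-"))  # a-b
print(call(shout_template, "hi", "there"))           # HI THERE
```

Forwarding needs no unpacking syntax: `shout_template` simply passes a list under the name `args`.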

This is when I realised I was learning this language. Maybe discovering it? I kind of stopped inventing it. It had features and I was discovering them and fixing the bugs in my compiler.

Problem Already Solved

One of the other things that I was fighting with was the ability to merge templates. GCL allowed you to “add” two templates which combined all the attributes. This resulted in all kinds of weird problems: it changed the evaluation semantics, often in ways that weren’t comprehensible, couldn’t handle diamond inheritance sensibly, and couldn’t recursively merge nested structures. Still, combining templates was so normal in GCL that I didn’t see how I was going to avoid it.

Then, I realised that Flabbergast already supported it and I didn’t have to do anything. This really felt like discovery.

It was possible to create a template that modified another template. Think of this as a higher-order function composing existing functions. What’s great about this approach is that it avoids all the weird problems: the evaluation semantics are easily inferred, diamond inheritance isn’t possible, and the user can define a recursive merge strategy if desired.

I ended up naming these kinds of templates -ifiers. If you had a template that squared values and then you wanted them summed, you could run it through a “summifier” to produce a sum-of-squares template. In fact, in the standard library, the sum template is the identity template run through the summifier. If you want to make a sum-of-cumulative-products-of-squares, all the pieces are in the standard library.
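Taking the higher-order-function comparison literally, the idea looks like this in Python. The names `identity`, `square`, and `summifier` mirror the prose, but the code is an analogy using plain functions and says nothing about how the standard library is actually written.

```python
def identity(args):
    """Template that hands its arguments back unchanged."""
    return list(args)


def square(args):
    """Template that squares each argument."""
    return [x * x for x in args]


def summifier(base):
    """Take a template that produces a list; return a template that produces its sum."""
    def summed(args):
        return sum(base(args))
    return summed


sum_template = summifier(identity)   # "sum" is just the summified identity
sum_of_squares = summifier(square)

print(sum_template([1, 2, 3]))    # 6
print(sum_of_squares([1, 2, 3]))  # 14
```

Because the -ifier wraps exactly one base template, there is never a second parent to conflict with, which is why the diamond problem does not come up.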

There’s an entire SQL query builder that’s just an -ifier that either modifies a template capable of doing database lookups or a template for offline query generation.

The Rewards of Jealousy

A feature I struggled with for a long time was type indication. Flabbergast always had simple type checks like x Is Int but it didn’t have an operation like JavaScript’s typeof, Python’s type, PERL’s ref, Ruby’s .class, Java’s .getClass(), or C#’s .GetType(). I wanted such a thing, but I wanted it to be good.

I strongly disliked JavaScript and PERL’s decision to return a string. Python, Ruby, C#, and Java all have operations that return a type type. This makes sense in languages where one could define new types, but that wasn’t possible in Flabbergast, so adding a type type would not be much more useful than returning a string. There’s basically nothing you could do with a type type in Flabbergast other than compare them.

The compiler had a “dispatch” operation it used internally. It would take a value of any type and then run different code for each possible type. I thought about including it directly in the language.

Then I came to the realisation I could use a weird asset of Flabbergast: contextual lookup. Flabbergast’s TypeOf expression is just a lookup, but rather than using a name, it uses the type of a value to generate a name to do a lookup.

That means the TypeOf operation will return whatever type you want; not force a string on you. In one context, you could use it as a check to ask if something is an allowed type and have it return a Boolean. In another context, it could return a string for display, or a number to be consumed by an external program.

In another context, it could look up a template. If you then instantiate that template, that’s dynamic dispatch as a by-product of lookup. I accidentally added dynamic dispatch to the language.
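A Python sketch of the idea, with every name in it hypothetical: `type_of` turns a value's type into a name and looks that name up in whatever context it is given. The naming scheme (Python's own type names) and the three example contexts are assumptions for illustration only.

```python
def type_of(value, context):
    """Generate a name from the value's type and look it up in the given context."""
    name = type(value).__name__  # assumed naming scheme: "int", "float", "str"
    return context[name]


# Context 1: a check that answers with a Boolean.
allowed = {"int": True, "float": True, "str": False}

# Context 2: a string for display.
display = {"int": "integer", "float": "floating point", "str": "text"}

# Context 3: something callable, standing in for a template to instantiate.
render = {
    "int": lambda v: f"{v:d}",
    "float": lambda v: f"{v:.2f}",
    "str": lambda v: repr(v),
}

print(type_of(42, allowed))             # True
print(type_of("hi", display))           # text
print(type_of(3.14159, render)(3.14159))  # 3.14
```

The last line is the accidental dynamic dispatch: the lookup picks the code to run based on the type of the value, with no dispatch feature in sight.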

I was surprised and pleased.

SQL: Coming Back Around

Of all the Flabbergast code, I’m most pleased with the SQL library. It started as a lark to be able to consume information from a database in Flabbergast, but took on its own life. The most frustrating thing about relational databases is that the syntax is slightly different between different databases. Because lookup can be used to change templates, it’s possible to create a declaration for a query, set an attribute to hold the implementations of all templates for a particular database vendor, and then substitute in the correct database syntax by lookup. That’s not hugely impressive as that kind of thing is a pretty standard object-oriented practice, but Flabbergast goes further.

Because lookup finds implementations using contextual lookup, unlike an object-oriented language, we don’t need a factory to produce our query pieces. The factory can be implicitly supplied by lookup and we can define the scope of the factory.
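The vendor substitution can be sketched in Python as follows. Everything here is illustrative: the tuple-based expression tree, the `render` function, and the two vendor tables are invented, and the quoting is simplified. The one factual ingredient is the dialect difference itself: PostgreSQL concatenates strings with `||` and quotes identifiers with double quotes, while MySQL uses `CONCAT()` and backticks.

```python
def quote_literal(text):
    # Simplified string literal: single quotes, with embedded quotes doubled.
    return "'" + text.replace("'", "''") + "'"


# Each vendor supplies an implementation for every kind of query piece.
POSTGRESQL = {
    "column": lambda name: '"' + name.replace('"', '""') + '"',
    "literal": quote_literal,
    "concat": lambda a, b: f"({a} || {b})",
}
MYSQL = {
    "column": lambda name: "`" + name.replace("`", "``") + "`",
    "literal": quote_literal,
    "concat": lambda a, b: f"CONCAT({a}, {b})",
}


def render(node, vendor):
    """Render a vendor-neutral expression by looking up each piece in the vendor table."""
    kind, *parts = node
    rendered = [render(p, vendor) if isinstance(p, tuple) else p for p in parts]
    return vendor[kind](*rendered)


# Declared once, with no mention of any vendor.
full_name = (
    "concat",
    ("column", "first_name"),
    ("concat", ("literal", " "), ("column", "last_name")),
)

print(render(full_name, POSTGRESQL))  # ("first_name" || (' ' || "last_name"))
print(render(full_name, MYSQL))       # CONCAT(`first_name`, CONCAT(' ', `last_name`))
```

In this sketch the vendor table is still passed explicitly. With contextual lookup it would simply be whatever is in scope where the query is used, which is the part a factory-based design cannot imitate cheaply.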

It’s also possible to compose the templates into useful SQL sub-expressions. This doesn’t happen much in the SQL library itself–users of that library would do that kind of composition. However, this kind of composition is very present in the compiler: templates are composed to make new templates in a way that looks indistinguishable from a “base” template. For instance, there’s a template for parsing an identifier that looks like a primitive parsing operation, but it’s a complex composition. If you were stuck in factory land, you would have to create a static method that took the factory and created the composition out of objects produced by the factory or create a factory wrapper that added some new methods (and keep on top of the changing API). For the kind of composition in Flabbergast, only the parts of the interface directly required by the composition matter.

The SQL library has all the type conversion rules so it can type-check an SQL query before it is run. It also does aggregation checks to make sure the GROUP BY clauses are correct.

When connected to a live database, it scans the schema and provides the database’s tables and columns as Flabbergast frames, so a misspelled column name results in an error before the query is run. If an active connection is not available, the schema can be provided.

It even supports operator overloading. The arithmetic expressions can find an overload rule for the argument types and take appropriate action. SQL has time and interval types that behave differently on different databases, but the library can paper over these differences. In SQLite, there’s no interval type, but there are functions to add a specified number of seconds to a time. Flabbergast is able to pretend that an interval exists, check the validity of the expression, and then call the appropriate SQLite date manipulation functions.
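As a sketch of what "pretending an interval exists" can mean, here is a Python model. The `Time` and `Interval` classes and the `add` rule are hypothetical and far simpler than the library. The SQLite side is real: `datetime()` accepts modifiers such as `'+90 seconds'`.

```python
import sqlite3


class Time:
    """Hypothetical expression node: a SQL fragment known to produce a time."""

    def __init__(self, sql):
        self.sql = sql


class Interval:
    """Hypothetical interval type, which SQLite itself does not have."""

    def __init__(self, seconds):
        self.seconds = seconds


def add(left, right):
    # Overload rule for Time + Interval; anything else is rejected before any SQL runs.
    if isinstance(left, Time) and isinstance(right, Interval):
        return Time(f"datetime({left.sql}, '{right.seconds:+d} seconds')")
    raise TypeError(f"cannot add {type(left).__name__} and {type(right).__name__}")


start = Time("'2017-03-01 00:00:00'")
later = add(start, Interval(90))
print(later.sql)  # datetime('2017-03-01 00:00:00', '+90 seconds')

connection = sqlite3.connect(":memory:")
print(connection.execute("SELECT " + later.sql).fetchone()[0])  # 2017-03-01 00:01:30
```

The type check happens while the query is being built, so `add(Interval(1), Interval(2))` fails in this sketch without the database ever seeing a query.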

Escaping

Flabbergast can also do correct, database-specific quoting. While a JDBC or ADO.NET driver can properly quote strings, that only works with a live database. With Flabbergast, it’s possible to generate a correctly quoted query and stick it into a shell script or Python source or YAML or XML or JSON, and you can nest any of those in each other. In fact, Flabbergast has a very sophisticated quoting mechanism that can properly quote things through many layers of nesting.

The quoting/escaping is a general system of rules for how to manipulate text. It specifies what to do with certain characters or ranges of characters. That lets you write a rule to convert non-printing characters to hex escapes or put a backslash in front of a particular character.
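A rule-driven escaper of this general shape can be sketched in Python. The `make_escaper` helper and the two rules are assumptions made up for illustration, not the Flabbergast rule format: each rule pairs a test on a character with what to emit instead, and the first matching rule wins.

```python
def make_escaper(rules):
    """Build an escaping function from (matches, replace) rules; first match wins."""
    def escape(text):
        out = []
        for ch in text:
            for matches, replace in rules:
                if matches(ch):
                    out.append(replace(ch))
                    break
            else:
                out.append(ch)  # no rule applies: keep the character as-is
        return "".join(out)
    return escape


# Rule for particular characters: put a backslash in front of quotes and backslashes.
backslash_rule = (lambda ch: ch in '"\\', lambda ch: "\\" + ch)
# Rule for a range of characters: turn control characters into hex escapes.
hex_rule = (lambda ch: ord(ch) < 0x20, lambda ch: f"\\x{ord(ch):02x}")

escape = make_escaper([backslash_rule, hex_rule])

print(escape('say "hi"\n'))  # say \"hi\"\x0a
print(escape(escape('"')))   # \\\"
```

The second call shows the nesting case: escaping already-escaped text adds another layer, which is what embedding a quoted string inside another quoted format requires.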

There are two implementations: one that does the quoting in a Flabbergast program and another that compiles the quoting rules to an AWK script so it can be used in a shell script. Much like choosing a different database vendor, it’s easy to specify the rules once and swap in either implementation.

Conclusion

I’m not sure where Flabbergast goes from here. I have satiated much of my curiosity and I’ve been delighted and surprised along the way. I think it’s a charming language.

While it certainly is more friendly than GCL, it is still weird. It might be weirder since it’s been stripped of some of the more normal features. I think those features lulled you into a false sense of security in GCL since they seemed familiar but didn’t act in familiar ways due to the context.

New programming languages seem to be ever more like old ones. Java, C#, Go, Python, Ruby, Swift, Kotlin, Scala, JavaScript, CoffeeScript, TypeScript, Dart, and Visual Basic are all heading to the same place. As different as those languages are, they are all trending to the same kind of class-based, object-oriented, single-dispatch, lexically-scoped, strongly-typed, memory-managed language with lambdas and syntactic sugar for creation of dictionaries and lists. Yeah, there are some differences on static versus dynamic typing, generics, and operator overloading, but nothing shocking. I don’t think there’s anything wrong with this category of language, but it’s ever harder to justify switching for ever smaller incremental gains.

There’s the camp of LISP, ML, OCaml, Clojure, Scheme, Haskell, and F# that is trying to reach an island of pure mathematical intention, which I like, but is also becoming homogeneous. C, shell, and PERL are not really being developed as languages anymore even though they are critical and omnipresent–they have become living fossils. C++ will continue to absorb every feature from every other language that templates can accommodate.

I want something that makes me think about problem solving in new ways.

Rust is new and exciting. Rust’s type system is truly a new perspective on memory management. It makes you think about what you mean and what you really want. Erlang is exciting (sadly, it’s not that new). The actor model it uses is very unusual and really useful in certain scenarios.

I think Flabbergast is new and exciting. Not Rust-level exciting, but more than another attempt to reinvent a better Java or JavaScript. I like that Flabbergast provides more functionality by having fewer underlying parts. Most languages have a dictionary and a list but Flabbergast merges these together. I worried I would come to regret that decision, but over and over, that’s proven to be simple, convenient, powerful, and expressive. The same is true of choosing templates only instead of templates and functions. That makes me happy.

It’s hard for me, already knowing GCL, to know what it would be like to learn Flabbergast if I only knew class-based, object-oriented, single-dispatch, lexically-scoped, strongly-typed, memory-managed languages with lambdas and exceptions and syntactic sugar for creation of dictionaries and lists. A Java programmer’s brain might explode or they might enjoy themselves.

If you want to give it a whirl, you won’t be the first person to learn it.

Embrace the weird. You don’t know what you might discover.