During the past few weekends I was thinking of a problem that came up at work.
Leaving the details out, a decent chunk of the problem has to do with transforming JSON objects dynamically. According to the rules that by themselves are dynamic. According to the dynamic set of rules that themselves have to be expressed in a JSON object.
Given this part of the problem is very much open-ended — and also is not, strictly speaking, 100% an engineering one — I resorted to a TDD approach of assuming certain product usecases would be covered, and designed the end-to-end architecture around this assumption. Ultimately, of course, it has come to figuring out how to best make the very inner engine work.
The first decision, that was already well crystallized in my head by the time to build the engine, is that the engine itself must be a functional, not an imperative one.
Building imperative state machines sucks. It sucks doubly when the ultimate end user is not an algorithms and data nerd to the degree I am one. Also, whether you like it or not, where there is a dynamic, flexible language used by non-engineers, those people would inevitably find hilarious ways to shoot themselves into the foot, which I could not even think of.
So, let’s keep it purely functional. No variables. No global state. No side effects. Pure transformation, from one object into another, perhaps with several steps in between, which by themselves are, of course, purely functional.
Originally I was planning to introduce strong typing to the objects and sub-objects to later be transformed, but have quickly let go of this idea, when the colleagues I have shared early results with were genuinely wondering if I am writing an obfuscation engine.
Very quickly, the above train of thought has come to a simple observation that I only need a very narrow subset of the engine to make the first step.
The nature of this project is that other teams will be working on implementing the rules application engine on their own platforms. So, while I would, of course, provide them with the full spec, several reference implementations, and quite a few test cases, the more compact and self-contained I make the very rule engine language, the more the other teams would thank me later on.
So, it quickly became apparent that some pre-processing step, a transpiler if you with, would be in order. Once I was past this pivotal moment, it was the very next step to realize that any and all “strong typing” I was hoping to introduce would be done before the transpilation phase.
By then it was, thus, exclusively about formulating the minimalistic self-contained version of the engine running the above transpiled code.
Obviously, the next question I asked myself was: should I transpile the input functional code into the output imperative one, or should I stick with the functional paradigm all the way?
In my head, this question has quickly evolved into a different one: Would a functional rule-applying engine be as simple to implement as an imperative one?
I gave it a shot, and it turned out pretty damn well to my taste. Would be curious if someone else could see major flaws which I may have missed. The main ideas of the language are outlined below.
The rule definition language consists exclusively of function definitions.
Zero-parameters functions are constants. Such as pi. Or such as a pre-determined list of JSON keys to be filtered out from the input object.
Other functions have one or more parameters. Such as object or string operations, or comparators, or conditional evaluation, a.k.a. f = g ? x ? y.
The most important piece is that each function consists of exactly one statement. Because if the function originally consisted of several statements, the transpiler can trivially decompose it into the trivial operations before sending it further to the low-level rules application engine.
Add functional operations, such as filter, map, and reduce, into the mix. Complement them with several ways to call the function — by its immediate name, or as returned from another function, and with explicitly provided list of arguments or on an array of arguments — and the language becomes Turing-complete, at least to the best of my understanding.
For the type system, I was lazy enough (and quite short on time, to be honest), and have simply adopted what JSON has to offer: numbers, strings, booleans, null-s, arrays, or objects, which are maps from strings to anything valid. Add type “function” into the list, and it becomes complete.
Last but not least: since there are is no global state and no mutable variables, given the same arguments the same function would evaluate to the same result.
So, the very low-level engine is a lazy evaluation of single function with a fixed input. Yes, this simple.
With the above being said, here are the real questions:
- Did I miss anything obvious?
- Was this done well enough before, so that I don’t re-invent the wheel?
- Why isn’t this idea more widespread?
How to solve the above better would be an interesting conversation to have.