Essence Algebras

Troy O'Neal
9 min readSep 21, 2020

--

Let’s start with a question, in the list of five objects [red, rEd, rED, green, GREEN], which pairs of objects are equal? Most would say that “red” is not equal to “green”, but there may be debate on whether “red” is equal to “rED”. Can we settle the debate? In this case, one approach would be to consider two “equivalence classes”

  1. [red, rEd, rED]
  2. [green, GREEN]

where two objects are considered equal if and only if they are in the same equivalence class. This yields a precise, crisp notion of equality.

There are a lot of applications where only the equivalence class of an object “matters”, and the particular object does not. However, it is often quite hard to directly represent the equivalence class in a computer, and it instead must be represented as a particular object of that class. We might notice that despite this limitation, we can often still make useful progress.

For example, these objects could be concrete representations of computer programs (e.g. source code), and the equivalence classes are behaviors of the program (note a well-engineered compiler preserves the equivalence class through various transformations). The objects could be mathematical expressions, and the equivalence classes the logical or algebraic “statements” which those expressions represent. Often we desire to manipulate an expression such that its “meaning” is preserved.

Notably, we don’t need to stop in the realm of math or programming objects. The objects under discussion can be quite abstract, and might represent a large variety of things, for example they could represent color, shape, texture, sound, etc. (try to think of equivalence classes for these, you will often find that you are able to do so easily). Given that there are so many different types of objects and equivalence classes, it might seem like a hopeless task to say anything in general about these systems. But surprisingly, it seems some progress can made, which is the topic of this post.

Making Progress: Essences and Expressions

We will now give special, technical names to the objects and equivalence classes described above. Namely, an equivalence class will be referred to as an “essence”, and an object will be referred to as an “expression”.

Now we’re ready to define the framework of “essence algebras”. In, ahem, essence:

Definition: An essence algebra is a set of “essences” and a set of “expressions”, where each expression has exactly one essence.

Here are some typical additional things we can attach to essence algebras:

  1. Essence operators. These input one or more essences, and output another essence.
  2. A simplification operator. This inputs an expression, and outputs another expression with the same essence as the input.
  3. External operators. These input an essence, and output some arbitrary value.

The key motivation for essence algebras is that we can often productively manipulate expressions to say things about essences without ever “writing down” those essences. Specifically, we can often compute external operators directly on expressions, bypassing the conversion into an essence. The simplification operator facilitates this process by making the input expression more computationally tractable.

Some notes. Sometimes what I’m calling an essence might be called a “value” by others. I’m choosing to call it something different to avoid confusion with values which are assigned to variables in many examples of essence algebras (which are _not_ the same thing as an essence), and to emphasize the fact that an essence is typically “more complicated” than a value.

The concept of an essence is also quite similar to the concept of a “type” in programming language theory. However, an essence tends to be much more specific than a type; a typical type might be “integer”, and a typical essence might be “a function that returns its argument squared” (although some higher-order type theories may get closer to the notion of essence). In fact, it appears that simple type systems can be expressed as essence algebras, making essence algebras the more general notion (experts, please comment!).

Up until this point, this definition might seem extremely vague, general, and useless. Let’s look at some concrete examples of essence algebras.

Integer Arithmetic

Basic operations on integers. Note: This is not full arithmetic. It’s a stripped-down version containing only addition and multiplication for simplicity’s sake.

Essences: All integers. E.g., -2, -1, 0, 1, 2, etc.

Expressions: The set of objects produced by the following formal grammar:

An expression is one of:

  1. Primitive (i.e., just an integer)
  2. (Expression + Expression)
  3. (Expression * Expression)

Example expressions: 1, 5, 1+3, 2 * 4, (2 + (4 * 5)), etc.

Essence operators: Addition and multiplication.

Simplification operator: Function which “evaluates” an expression. E.g., given the expression 1+1, it returns the primitive 2. Here, it’s trivial to write a program that can evaluate any (finite) expression into a primitive integer.

External operators: Nothing too interesting here, but a trivial one would be getting the integer value of an essence in some external integer representation, e.g. 64-bit signed integer.

Ok, that was a neat simple example. What else can we do?

Let’s move to high school and do some (elementary) algebra.

Symbolic Expressions Over Real Variables

These are a bit difficult to define in prose, but they roughly correspond to the types of symbolic objects manipulated by high school students during an elementary algebra or calculus class. Let’s look at how to frame them in an essence algebra.

Essences: Mappings of all assignments of variables to some output type. Example essences:

  1. f(x): real -> real := x + 1
  2. f(x): real -> bool := (x == 3)
  3. f(x, y): (real, real) -> real := x * y + 4

Expressions: the set of objects produced by the following formal grammar:

An expression is one of:

  1. Primitive real number (e.g. 1)
  2. Variable (e.g. x)
  3. (Expression + Expression)
  4. (Expression * Expression)
  5. (Expression == Expression) (equality)
  6. AND(Expression, Expression) (logical conjunction)
  7. OR(Expression, Expression) (logical disjunction)

Note: there are ignored type issues (e.g. AND of two integers makes no sense) which are unimportant for the current discussion, but can be solved in a few ways, e.g. by being more careful about the formal grammar definition (introducing a BooleanExpression etc), or introducing a TypeError result type.

Essence operators: We won’t focus on them, but they exist.

Simplification operator: There are many choices here, but the general flavor is that they can do “expression simplification” as a typical high school student might. Example, given the expression 1 + x + 2, the simplification operator might output the expression x + 3. Note the essence of the expression is precisely preserved (i.e., for all possible values of x, the result of the expression is the same real number. Strictly speaking, we need some entity that reduces the substituted expressions into a final real number to prove that this is the case (e.g. reducing (1 + 4 + 2) into 7, but we’ll ignore this detail)). The other typical property of the simplification operator is that the complexity of the expression is reduced (typically meaning that we need less symbols and levels of nesting to represent the same essence). Here, we went from “1 + x + 2” (5 symbols) to “x + 3” (3 symbols), an improvement of 2 symbols.

External operators: There are a lot of possibilities here. One example is the operator that computes the “simplest form” of an essence, outputting an expression. E.g. the essence of “2*x — x” would map to “x”. Another is an operator that “solves” an equation (outputting real values for all arguments to the essence). One might also study operators that output whether a (boolean-valued) essence is logically consistent (i.e., satisfiable).

What else can we do with essence algebras?

Distributions

Here, we are using the word distribution in a specific sense. Roughly speaking, define a “distribution” as a probability distribution in the traditional sense, but removing the constraint that all probabilities sum to one. These distributions are often useful for talking about multiple simultaneous sub-objects (e.g. multiple possible outcomes of an experiment, multiple types of assets owned by a portfolio, or multiple answers to a question). Such distributions can also be sensibly “sampled” to produce random sequences of values. So how can we frame them in terms of an essence algebra?

Essences: Mappings of a “sample value” to a real-valued “weight”.

Example essences:

  1. f(x): integer -> real := (x == 0 ? 1.0 : (x == 2 ? 3.0 : 0.0 )). This is pseudocode for: “if x is 0, output 1, if x is 2, output 3, else output 0”. This could be thought of as a discrete mass distribution, or, loosely, an (unnormalized) discrete probability mass function.
  2. f(x): RealInterval -> Real := Integrate(GaussianPdf(z), dz, from: z=x.start, to: z=x.end). This is one way of representing a continuous gaussian distribution over a single real variable. It inputs a RealInterval (consisting of a start and end) and outputs a real-valued “mass” for that interval by integrating the gaussian probability density function.

Expressions: the set of objects produced by the following formal grammar:

An expression is one of:

  1. Impulse. (define this as a function that maps a single sample space value to a weight)
  2. ScalarWeighting(Expression, weight), where the weight is a real number. This means to multiply the weights of all sample space values in the input expression by the given weight.
  3. Basket(Expression0, Expression1, …), where each weight is a real number. This represents the “addition” of sub-distributions.
  4. (Expression * Expression) (this represents the Cartesian product of all sample space values in the two input expressions. There are a bunch of details as to what this actually means but they aren’t important.)

Essence operators: At least the addition and product operations defined above.

Simplification operator: There are many choices here. For the case of discrete distributions (i.e., those with a countable number of sample space values), we can hack together a quick prototype along the following lines. Imagine our input expression is Basket(Basket(“x”: 1, “y”: 2), “x”: 3) (where the sample space values are “x” and “y”). We can simplify this input expression into a single, “flattened” Basket: {“x”: 4, “y”: 2}. This is done simply by aggregating weights for all encountered sample space values.

External operators: Again, there are a lot of choices, but one interesting one is the idea of “sampling” a distribution, which outputs a particular sample space value chosen “randomly”.

There’s one more example of an essence algebra that I found useful, and it was the one that pushed me over the edge to write this blog post.

Geometric Regions

Essences: Mappings of N-dimensional real-valued vector to a boolean. These can be thought of regions of space, and include lines, triangles, squares, spheres, etc.

Example essences:

  1. f(x): real->bool = 1 < x < 2 (this is a line segment of length 1).
  2. f(x, y): (real, real) -> AND(0 < x < 10, 5 < y < 15) (this is a (filled) square with area 100 whose center is at (x,y)=(5, 10))

Expressions: the set of objects produced by the following formal grammar:

An expression is one of:

  1. Primitive region. Some given computable function f: real^N -> bool.
  2. Intersection(Expression, Expression)
  3. Union(Expression, Expression)

Simplification operator: This heavily depends on the representation of primitive regions, but we can sketch an example. Imagine all primitive regions are represented as 2D boxes. Note the intersection of two 2D boxes is either empty or another 2D box (ignore the degenerate cases as they can technically be represented as zero-length 2D boxes themselves). Thus, if we have an input expression AND(Box2D(0 < x < 10, 0 < y < 10), Box2D(5 < x < 15, 5 < y < 15)), a reasonable simplification operator might output Box2D(5 < x < 10, 5 < y < 10).

External operators: Several possibilities again, but one simple and useful example is the hypervolume operator. (For one dimension, this is the length, for two dimensions it’s the area, for three dimensions it’s the volume, etc).

Final Thoughts

Essence algebras can often be nicely expressed in traditional object-oriented programming languages like Python or C++. Specifically, each production rule in the expression grammar corresponds a class. An instance of one of those classes corresponds to a particular expression. It is often useful to enforce immutability on the expression instances to make it easier to reason about their manipulation. Using Python and integer arithmetic as an example, we’d have a base class IntegerExpression, a class Integer(IntegerExpression) (representing a primitive integer expression), a class Add(IntegerExpression) with two instance IntegerExpression variables, etc. We’d have a static function simplify(x: IntegerExpression) -> IntegerExpression. And we could have an external operator extract_int(x: IntegerExpression) -> int.

One might notice that the simplification operator and external operators for an arbitrary essence algebra can often be implemented surprisingly straightforwardly in your favorite programming language by following this framework. I, at least, found it interesting that so many mathematical systems could be expressed nicely within it. I have used this framework as the foundation for several experimental projects, including a functional programming language and a mathematical prover. I hope you found this topic interesting. Thanks for reading.

--

--