Philosophizing Programming

Malina Tran
Published in Tech and the City
7 min read · Jul 12, 2016
Artwork by Diego Cardoso

Watching Rich Hickey’s keynote at JVM Languages Summit 2009 took me back in time. Specifically, to the summer of 2003 when I was enrolled in a philosophy course. We ruminated about reality and perception and objects and time. Often when talking about reality, we find ourselves delving into the world of the abstract.

Interestingly, Hickey talks about time as a feature that is both evasive and essential. And here is my attempt at synthesizing one of Hickey’s most popular talks, in which he discusses the future of object-oriented programming.

Hickey dives into some concepts underlying programming, most of which are beyond the scope of my own experience. He introduces basic principles (state, identity, value, etc.) and examines issues surrounding time, parallelism, and concurrency.

Same, Same, But Different

I am not of the camp that believes one programming language is superior to another. That is perhaps the single most dangerous notion, especially among new developers. (I also read a really good blog post on how this cultivates a contempt culture.)

Fellow coding newcomers have professed their love for JavaScript and disdain for Ruby. Perhaps it is because Ruby was my first foray into coding and I have been able to explore it more deeply than other languages, but I do not understand their perspective. Your lack of appreciation, I want to say, is just a marker of your limited understanding of the language’s potency.

Hickey hits the mark when he talks about the similarities among object-oriented programming languages, such as Java, Python, and C++. He likens these languages to cars: all cars functionally accomplish the same thing (hence the artwork by Diego Cardoso). The major differences have more to do with a programmer’s preference for syntax and expressivity. After all, here are attributes shared among OOP languages:

  • A concept of objects based on modeling the real world
  • Mixins/interfaces
  • Static/dynamic typing
  • Semicolons/indentation/blocks
  • Inner classes

Hickey urges us to think about how programming can navigate a complex, concurrent, and heterogeneous world. He talks about “incidental complexity.” This refers to the set of problems that derive from choosing one particular language, tool, or strategy over another. Perhaps the most difficult problems are the ones disguised as being simple. For example, C++ lacks automated memory management, otherwise known as garbage collection (or GC), which automatically reclaims memory that is no longer used in the program. Java, on the other hand, implements GC and consequently has a huge library collection.

Pure Functions, The Bricks of Our House

Large programs can be difficult to maintain as a result of incidental complexity. With each new programming language comes the opportunity to address the issue of concurrency. Concurrency refers to the ability of different units of a program or algorithm to execute out of order, or in partial order, without affecting the final outcome.

One of the selling points of OOP is the composability of units. After all, with OOP we can name functions, store information in them, and encapsulate them in order to build upon them. Pure functions are ideal in many ways: they take immutable values, do stuff locally with them, and return another immutable value. They are easy to understand, change, test, and compose. And they have no notion of time. (In contrast, objects and methods are not worry-free.)
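To make the contrast concrete, here is a minimal Python sketch of a pure function next to an object that changes in place. The names `add_item_pure` and `MutableCart` are hypothetical, invented for illustration; they do not come from Hickey's talk.

```python
from typing import List, Tuple


def add_item_pure(cart: Tuple[str, ...], item: str) -> Tuple[str, ...]:
    """Pure: takes an immutable value and returns a new immutable value."""
    return cart + (item,)


class MutableCart:
    """Changes in place: what you observe depends on when you look."""

    def __init__(self) -> None:
        self.items: List[str] = []

    def add(self, item: str) -> None:
        self.items.append(item)


# The pure function leaves its input untouched.
before = ("apple",)
after = add_item_pure(before, "pear")

# The object is the same "thing" before and after, but its contents differ.
cart = MutableCart()
cart.add("apple")
observed_early = len(cart.items)
cart.add("pear")
observed_late = len(cart.items)
```

Calling `add_item_pure(before, "pear")` again gives the same answer no matter when it runs, whereas `len(cart.items)` depends on the moment of observation.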

Yet most programs are not functions; they are processes, since they do not return the same result every time they run.

No Sense of Time

This is where I feel as if we get into philosophy of code. Objects are simplistic models of the real world. They include behavior that is associated with data. And lastly, they have no concrete notion of time and no proper notion of value. (Functions, on the other hand, don’t deal with time nor do they pretend to deal with time).

What has OOP been doing wrong? It has gotten time wrong. When creating objects, we give them the ability to change in place, and we allow ourselves to see that change in place. There is no concrete notion of time or of values; values are fabricated. We have also conflated symbolic reference (identity) with actual entities: the idea that I attach to this thing is the thing that lasts over time. (I know… how trippy.) For instance, setting a variable to an integer does not mean that the integer persists over time.

Hickey references mathematician and philosopher Alfred North Whitehead extensively. Whitehead asserts that time is atomic. You cannot touch time; rather, you can observe its sequential changes. All you’re getting is a value at a point in time. In other words, you cannot cross the same river twice.

Process creates the future from the past; the future is a function of the past (doesn’t change the past)

To get further into the philosophical weeds, Hickey states the following:

  • Actual entities are immutable values that are atomic (occurring instantaneously and in isolation from concurrent processes)
  • The future is a function of the past, but it doesn’t change it (process creates the future from the past)
  • Identity is a series of related values, but there is no enduring entity
  • Time is a succession of process events

Breaking It Down

In order for our programs to make decisions, we need to have stable values that can be perceived and remembered. We need to get state and time right in order to model things in code the way they behave in the real world. There is, however, a difference between the real world and our model of it: in the real world, we don’t get to stop everything when we want to look around and control it. In a concurrent world, everything will proceed regardless of what we wish, intend, or hope for.

Below is a diagram of how we can conceptualize time. And below that is a list of defined terms that may clarify the diagram.

Screenshot of a slide from Hickey’s keynote at JVM Languages Summit in 2009

  • Value: an immutable magnitude, quantity, number, or composite of these things
  • Identity: a derived concept based on a series of values/states that are causally related over time
  • State: a snapshot of an identity at a moment in time
  • Time: purely relative, with no measure or dimension; it refers only to certain values coming before or after others
  • Perception: observations of the world

This strikes me as particularly profound. Perception is massively parallel and requires no coordination, communication, or message passing. It does not belong in the timeline. We are always perceiving the past and never perceiving the present. We are calculating based on the past. By the time we’re making a decision about anything, we are observing the past because time is instantaneous.

Bringing It All Together

Hickey talks about what we need in a programming language. We need a language that represents values and manages change, or succession, in values (aka time). We can consume memory to model time. A pure function takes an old value and produces a new value, and that will inevitably consume some memory. What this creates is a snapshot that serves as a record, memory, or perception. If we consume memory this way, the role of GC is to clean up the no-longer-referenced past, since we don’t care about it anymore.

Hickey advocates for persistent data structures (PDS) as the solution. Here are their defining qualities:

  • They are immutable, which is ideal for states, snapshots, and memories with stable values for decision making/calculation.
  • They never need synchronization among all the perceivers!
  • When a new value is made, it shares structure with the prior value but does not disrupt the old value or anyone’s perception of it.
  • They reduce complexity.

Under the hood, PDS are represented by trees. Trees support structural sharing and path copying, with the addition of new nodes. They are easily composable. While one issue with persistent data structures is their slower performance, Hickey states that one way to resolve this is through safe “transient” versions, which he says are 90% as fast as mutable objects.
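Structural sharing and path copying can be sketched with a small persistent binary search tree in Python. This is an assumption-laden illustration: `Node`, `insert`, and `to_list` are hypothetical names, and a plain binary search tree is used for brevity rather than whatever tree a real persistent collection library would use.

```python
from typing import List, NamedTuple, Optional


class Node(NamedTuple):
    """An immutable tree node."""
    key: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def insert(root: Optional[Node], key: int) -> Node:
    """Return a new tree containing key; the old tree is left untouched.

    Only the nodes on the path from the root to the insertion point are
    copied (path copying). Every other subtree is shared with the old tree.
    """
    if root is None:
        return Node(key)
    if key < root.key:
        return root._replace(left=insert(root.left, key))
    if key > root.key:
        return root._replace(right=insert(root.right, key))
    return root  # key already present: nothing to copy


def to_list(root: Optional[Node]) -> List[int]:
    """In-order traversal, for inspecting a version of the tree."""
    if root is None:
        return []
    return to_list(root.left) + [root.key] + to_list(root.right)


v1: Optional[Node] = None
for k in (5, 2, 8):
    v1 = insert(v1, k)

v2 = insert(v1, 9)  # a new version; v1 remains a valid, unchanged value
```

Here `v2.left is v1.left` holds, since the left subtree was not on the copied path, while `v2.right` is a fresh copy. Anyone still holding `v1` keeps seeing the keys 2, 5, and 8.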

Hickey also talks about the importance of emulating time in a program and driving time forward. What this requires is state succession and the ability to provide a value at a point in time. Multiple timelines are also an important, and desirable, consideration. Different strategies for time constructs include: compare-and-swap (CAS), which builds a timeline for each identity; agents, which are asynchronous and not connected to a timeline; and software transactional memory, which allows coordinated timelines between multiple identities and the caller.
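As a rough sketch of the CAS strategy, the Python below builds one timeline for a single identity. Python has no built-in compare-and-swap primitive, so a lock stands in for the hardware instruction here; that substitution, and the names `Atom`, `deref`, `compare_and_set`, and `swap`, are my assumptions for illustration.

```python
import threading


class Atom:
    """Hypothetical CAS-style reference holding one identity's current value."""

    def __init__(self, value):
        self._value = value
        self._lock = threading.Lock()  # stands in for an atomic CAS instruction

    def deref(self):
        """Perceive the current value without coordinating with writers."""
        return self._value

    def compare_and_set(self, expected, new) -> bool:
        """Install new only if the current value is still the expected one."""
        with self._lock:
            if self._value is expected:
                self._value = new
                return True
            return False

    def swap(self, fn, *args):
        """Apply a pure function to the current value, retrying on conflict."""
        while True:
            old = self.deref()
            new = fn(old, *args)
            if self.compare_and_set(old, new):
                return new


counter = Atom(0)


def worker() -> None:
    for _ in range(1000):
        counter.swap(lambda n: n + 1)


threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Each successful `swap` is one step in the identity's timeline. If another thread gets there first, the function is simply recomputed against the newer value, so no increment is lost.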

Hickey points to a concept, often used when talking about databases, as a way to think about concurrency. “Multiversion concurrency control” refers to keeping history to satisfy readers. Its key attribute is that perceivers do not impede process; in a data-related context, readers do not impede writers. This allows observers/readers to have the notion of a timeline by keeping some history. PDS makes history cheap. When our brain reconstructs behavior, we observe one snapshot known as “before” and another known as “later” and deduce the delta.
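The idea of keeping history so that readers can compare a “before” and a “later” can be sketched as follows. `VersionedStore` is a hypothetical, single-threaded toy written for this post, not how any real database implements multiversion concurrency control.

```python
from typing import Callable, FrozenSet, List


class VersionedStore:
    """Toy store that keeps every committed version as an immutable value."""

    def __init__(self, initial: FrozenSet[str]) -> None:
        self._versions: List[FrozenSet[str]] = [initial]

    def snapshot(self) -> FrozenSet[str]:
        """Readers get the latest committed value; it never changes under them."""
        return self._versions[-1]

    def commit(self, fn: Callable[[FrozenSet[str]], FrozenSet[str]]) -> None:
        """Writers append a new version computed from the latest one."""
        self._versions.append(fn(self._versions[-1]))

    def history(self) -> List[FrozenSet[str]]:
        return list(self._versions)


store = VersionedStore(frozenset({"a", "b"}))

before = store.snapshot()                     # a reader perceives the past
store.commit(lambda s: (s | {"c"}) - {"a"})   # a writer proceeds unimpeded
later = store.snapshot()

# Deduce the delta from the two snapshots.
added = later - before
removed = before - later
```

The reader holding `before` is never blocked and never sees a half-applied change; comparing it with `later` yields the delta (here, "c" added and "a" removed).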

The takeaways from the talk are the following:

  • Excessive complexity requires change
  • Complexity can be attributed to conflation of behavior, state, identity, and time
  • We need to be explicit about time
  • We should be programming with pure functions and immutable values
  • The epochal time model (diagram above) is a general solution
  • We have the current infrastructure for experimentation
  • In the future, we will have to coordinate internal time with external time, have better performance and more parallelism, and have better versions of data structures and more time constructs
