Stream Thinking

Transforming mutable state into
functional processes

I have been writing very small objects in Ruby and JavaScript, each with limited responsibility. These objects typically are initialized with data, and while that data is used during the course of the object’s fleeting life, it doesn’t change. Here is an example in Ruby:

class NumberToWords
attr_reader :number, :descriptions

def initialize(number)
@number = number
@descriptions = []

def to_words
  def build_descriptions
# calls helper methods to build out a series of descriptions
# based on the size of the number and the digits in each place
  def join_descriptions 
# an algorithm that that figures out edge cases
# for joining the descriptions

These one-time-use objects that don’t mutate external state, making them almost immutable objects. Immutable objects are an important concept, because concurrency and parallelism are becoming key to our coding future.

So, what prevents `NumberToWords` from being truly mutable? It is the instance variable @words, which accumulates state gradually through each of the (imaginary) helper methods called in the `build_descriptions` method.

If I imagine this algorithm, it breaks down the number into parts. If we could describe numbers less than a thousand reliably, we could break down any larger number.

756 becomes “seven-hundred fifty-six”.
756,756 becomes “seven-hundred fifty-six thousand and seven-hundred fifty-six”.
… and so on.

In the way that I have outlined it, the algorithm is a map/reduce problem. We split the larger number into three digit smaller numbers. We map over each smaller number to build a description. Then we reduce these back together by joining the parts with their place value descriptions, and where appropriate ‘and’.

Map/reduce as an algorithm was built for concurrency, so how can we transform mutable state of an expanding array into a concurrent happy thing?

Many functional languages have the notions of infinite sequences or a ‘lazy’ sequence. In Clojure, the infinite sequence of integers is simply `integers`. In order not to burn through every bit of memory, Lisp languages also offer the `take` function which picks off a number of elements from the infinite sequence.

(take 5 integers)

This kind of infinite sequence is possible because the sequence can predict its next element.

What we really need is the opposite kind of a structure. We need a sequence of unknown length, potentially infinite that receives elements and responds.

In mid-2012, Node.js introduced streams, inspired by Unix pipes. They are lazy sequences that generate events. Events are attached to callbacks. Node’s heavy reliance on events led to a fair amount of ceremony when dealing directly with streams. For example, it is possible when receiving a stream that there is already content queued and waiting. It is possible that the stream may move to fast and need to be paused. Although streams in Node have evolved and are now in a place where they can receive objects, not just binary data, they are still hard to use.

My passion project since this summer has been inventing a language. I call it Fauxy because it is both faux object-oriented and faux functional. It dreams about hitting that sweet spot between objects and statelessness. I would like Fauxy to have streams as easy to use, easy to create building blocks for transforming mutating state into good concurrency patterns.

number-to-words = -> (number) { 
/* magic conversion happens here */
stream = -> (number) { 
Console.print-line( number-to-words(number) )
// "seven-hundred fifty-six"

Such a stream would do that mapping work, but all that data needs to be aggregated for a reduce process. Map algorithms, though concurrent, are not strictly functional. They amass the individual functional contributions into an unordered and potentially boundless list. They are the underlying data structure I have been sniffing at.

An accumulating stream would encapsulate this mutation, via a slight of hand, a layer of indirection. The stream could then be responsible for preventing concurrent shared memory disasters, via transactional memory or actor patterns or ….

Yesterday I woke to the buzz of Matz, Ruby’s creator, thinking in code about streams in a new language called Streem. It looks like he two is inspired by this Unix pipe process.

Streams might be part of a larger set of concurrency objects with accumulating state that would allow programmers to aggregate functionality into encapsulated ‘objects’.

A full and fully hypothetical version of the NumberToWords problem in Fauxy is available here. Heads up on a syntax gotcha: Fauxy uses multiple dispatch to eliminate `if` statements and `while` loops. So, you will see multiple methods with different signatures or conditional signatures.

Show your support

Clapping shows how much you appreciated Kane Baccigalupi’s story.