Learning Ruby through Clojure

Ahmad Hammad
Buttercloud Labs
6 min read · Sep 10, 2015

My interest in Clojure was piqued in late 2013 when I decided to take a Clojure crash course taught by @henrygarner and @jarohen. As a newcomer to Clojure (especially from an OOP background) you are told that there will come a moment when it ‘clicks’ and you just get it. For me the aha! moment was understanding why and how the language and its data structures were designed the way they are, just like when Neo finally sees the Matrix for what it really is: code. The parentheses faded away and I saw the syntax for what it really is: data. Anyway, this isn’t about my Clojure epiphany but about how it helped me improve my Ruby code.

In this post I’d like to touch on four things I learnt more about in Ruby through Clojure. I don’t aim to cover them in-depth but rather to shed some light on them and point you to resources that do a great job at explaining them in detail.

Destructuring

Destructuring is a pattern of capturing values from a data structure into a set of symbols/variables most commonly within a function’s or block’s argument list.

Clojure provides a strong set of destructuring patterns that are commonly used to make code more readable and concise. It supports destructuring two kinds of data structures: vectors (Array in Ruby) and maps (Hash in Ruby), or anything that implements the IPersistentVector or IPersistentMap interfaces. (To learn more about Clojure’s awesome destructuring syntax you can check out this article by @thejayfields.)

Ruby only provides one form of destructuring, which acts on Arrays; however, the splat operator * can also be used in creative ways to help out. I was always aware of Ruby’s destructuring, but its effect on code brevity only dawned on me when I started using it in Clojure.

The most basic form:

x, y = [1, 2]
puts x+y
# 3

All we did here was assign the first element, 1, to x and the second element, 2, to y. I find this simple pattern most useful in the following two situations:

1) Returning more than one value from a function:

def get_position
  [10, 20, 50]
end

x, y, z = get_position
puts "x = #{x}, y = #{y}, z = #{z}"
# x = 10, y = 20, z = 50

2) Argument destructuring:

peeps = [["John", ["male", 28]], ["Jane", ["female", 40]]]

peeps.each do |(name, (gender, age))|
  puts "#{name} is a #{age} year old #{gender}"
end
# John is a 28 year old male
# Jane is a 40 year old female

The rest of the patterns are all based on the aforementioned two with some splatting help. @pitluga does a great job at digging into this topic here.
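To give a flavour of that "splatting help", here is a minimal sketch. The helper names (head_and_tail, ends_and_middle, describe) are made up for illustration and are not from any library.

```ruby
# Hypothetical helper: split a list into its first element and the remainder.
def head_and_tail(list)
  head, *tail = list # the splat collects everything after the first element
  [head, tail]
end

# Hypothetical helper: pick out both ends and keep the middle together.
def ends_and_middle(list)
  first, *middle, last = list
  [first, middle, last]
end

# The splat also works inside a block's argument list.
def describe(people)
  people.map do |(name, *details)|
    "#{name}: #{details.join(', ')}"
  end
end

p head_and_tail([1, 2, 3, 4])      # prints [1, [2, 3, 4]]
p ends_and_middle([1, 2, 3, 4, 5]) # prints [1, [2, 3, 4], 5]
puts describe([["John", "male", 28], ["Jane", "female", 40]])
# John: male, 28
# Jane: female, 40
```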

Sets

Clojure’s syntax by design revolves heavily around its collection data structures. As a result, you have no choice but to get intimately familiar with their purpose and underlying design in order to write efficient Clojure code. These include Lists '(), Vectors [], Maps {} and Sets #{}.

As you would imagine, Sets are based on the mathematical set, which is basically a collection of distinct items. In Ruby, unlike Arrays [] and Hashes {}, Sets don’t have a literal syntax representation; instead you can create a set in one of two ways (on Ruby versions before 3.2, run require 'set' first):

Set.new([:a, 1, "hey", :a])
# => #<Set: {:a, 1, "hey"}>

or

[:a, 1, "hey", :a].to_set
# => #<Set: {:a, 1, "hey"}>

Notice that in both cases the set is instantiated with only one :a symbol in it. Duplicates, as determined by Object#hash and Object#eql?, are simply thrown away with no warning.
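Because duplicates are decided by eql? rather than ==, the result can be a little surprising. A small sketch (the constant names NUMBERS and WORDS are only for illustration):

```ruby
require "set"

# 1 and 1.0 are == but not eql?, so a Set keeps both of them.
NUMBERS = Set.new([1, 1.0, 1])

# Equal Strings are eql? and hash alike, so the second "hey" is dropped.
# The Symbol :hey is a different object type and is kept.
WORDS = Set.new(["hey", "hey", :hey])

puts 1 == 1.0     # true
puts 1.eql?(1.0)  # false
puts NUMBERS.size # 2
puts WORDS.size   # 2
```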

An important thing to note here is that Ruby Sets store their values internally in a Hash. This is the Ruby implementation of the Set#add function:

def add(o)
  @hash[o] = true
  self
end

So when we create a Set like this:

s = [:a, :b, :c].to_set

it will be internally represented as:

{a: true, b: true, c: true}

What does that mean for us? Well, for starters it means that calling Set#include? or Set#member? will simply call Hash#include? on the internal Hash instance, and therefore we get the advantage of extremely fast Hash lookups! Read about why Hash lookups are fast in this detailed explanation from the guys at Engine Yard.

Given what we’ve learnt so far about Sets in Ruby, we can already deduce that using a Set is superior to using an Array when:

a) We only want to store distinct values. This saves us from polluting our code with Array#uniq, resulting in faster and cleaner code.

b) We need fast membership lookups on a list of values. This can be slightly less verbose than creating a lookup Hash.

Keep in mind that, unlike an Array, a Set has no indexed access and its documentation makes no promise about element order. If you need the elements kept in sorted order you can use SortedSet (part of the set library up to Ruby 2.7, and available through the sorted_set gem from Ruby 3.0 onwards).

Another important thing to mention is that Sets are Enumerable, so you get all those nifty collection functions you are so used to, as well as the mathematical set functions such as Set#subset?, Set#superset? and the like, which you can discover in the docs.
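Both situations can be sketched roughly like this. ADMIN_ROLES, admin? and distinct_tags are hypothetical names invented for the example, and the post Hashes are assumed to carry a :tags Array.

```ruby
require "set"

# Hypothetical list of roles allowed to see an admin page.
# The repeated :admin is silently dropped.
ADMIN_ROLES = Set.new([:owner, :admin, :admin, :support])

# b) Membership check backed by a Hash lookup rather than an Array scan.
def admin?(role)
  ADMIN_ROLES.include?(role)
end

# a) Collect distinct tags without a trailing Array#uniq.
def distinct_tags(posts)
  posts.each_with_object(Set.new) { |post, tags| tags.merge(post[:tags]) }
end

puts admin?(:admin) # true
puts admin?(:guest) # false
puts distinct_tags([{ tags: [:ruby, :clojure] }, { tags: [:ruby] }]).size # 2
```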
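A quick taste of those set functions, using two made-up sets (READERS and WRITERS are just illustrative names):

```ruby
require "set"

READERS = Set[:ann, :bob, :cat]
WRITERS = Set[:ann, :bob]

puts WRITERS.subset?(READERS)   # true
puts READERS.superset?(WRITERS) # true

# Set algebra: difference, intersection and union all return new Sets.
read_only  = READERS - WRITERS
both       = READERS & WRITERS
everyone   = READERS | Set[:dan]

puts read_only == Set[:cat] # true
puts both == WRITERS        # true
puts everyone.size          # 4

# Enumerable methods work too; map returns a plain Array.
p READERS.map(&:to_s).sort # prints ["ann", "bob", "cat"]
```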

Lazy Sequences

Lazy sequences, or lazy enumerations, were introduced to Ruby in the 2.0.0 release; however, I’ve rarely seen them put to use. In Clojure, most core sequence functions are implicitly lazy, and some functional languages, such as Haskell, are lazy by default.

Laziness has several characteristics and use cases. Lazy enumerations allow you to easily work with large (including infinite) collections by passing or executing one item at a time. What does that mean? Doing something like this in non-lazy Ruby will result in an infinite loop:

# Given an infinite range, return the result of n+1
# for the first 10 items
(1..Float::INFINITY).map { |n| n+1 }.first(10)
# Don't hold your breath

Even though we only want the first 10 numbers, Enumerable#first doesn’t actually get called until Enumerable#map is done executing the block on every item in the range, which in this case is infinite, so it will never reach first(10).

To fix this we can call the Enumerable#lazy method on our range:

(1..Float::INFINITY).lazy.map { |n| n+1 }.first(10)
# => [2, 3, 4, 5, 6, 7, 8, 9, 10, 11]

This basically turns it into a conveyor belt passing on each value one by one as opposed to working on the entire batch and then dumping it all to the next function in the chain.
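One way to see the conveyor belt in action is to log each step. The sketch below uses two hypothetical helpers, trace and first_even_successors, and a tiny range so the log stays readable.

```ruby
# Record each step so the order of evaluation is visible.
# Returns the result together with the log of steps taken.
def trace(lazy)
  log = []
  source = lazy ? (1..3).lazy : (1..3)
  result = source
    .map    { |n| log << "map #{n}"; n + 1 }
    .select { |n| log << "select #{n}"; n.even? }
    .first(2)
  [result, log]
end

# An infinite source is fine as long as the chain is lazy.
def first_even_successors(count)
  (1..Float::INFINITY).lazy.map { |n| n + 1 }.select(&:even?).first(count)
end

_eager_result, eager_log = trace(false)
_lazy_result, lazy_log = trace(true)

puts eager_log.join(", ") # map 1, map 2, map 3, select 2, select 3, select 4
puts lazy_log.join(", ")  # map 1, select 2, map 2, select 3, map 3, select 4
p first_even_successors(3) # prints [2, 4, 6]
```

In the eager version every item goes through map before select sees anything; in the lazy version each item travels through the whole chain before the next one starts.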

For more use cases and a detailed drill-down, check out @pat_shaughnessy’s great article “Ruby 2.0 Works Hard So You Can Be Lazy” on laziness in Ruby 2.0.

Immutability

Immutability is at the core of Clojure and one of the many things it handles really well. What is immutability exactly? Simply put, an object that can’t be changed once it has been created is considered immutable. Why would we need that? Immutability offers some interesting benefits, including peace of mind (for the developer), simpler concurrency, and sometimes performance boosts.

What do I mean by peace of mind? How many times have you pulled your hair out trying to figure out how and where the value in your variable changed? This most often happens with instance variables in Rails controllers that trickle down through the hierarchy of child controllers and down to the views and any partials rendered in those views. The value of that variable could have been changed multiple times by different parts of code along the way and the only way to find out is to trace back every occurrence of it in the code it is scoped in, place a few breakpoints or print values left and right. This can be avoided by explicitly cloning the object whenever you need it to change and returning a new value, or by using an immutable data type.
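As a rough sketch of that last point, here is what "return a new value instead of changing the old one" can look like with a frozen Hash. DEFAULTS, with_theme and try_to_mutate are hypothetical names, not part of Rails or any library.

```ruby
# Hypothetical settings object shared across several layers of an app.
DEFAULTS = { theme: "light", per_page: 20 }.freeze

# Return a changed copy instead of mutating the shared Hash.
def with_theme(settings, theme)
  settings.merge(theme: theme)
end

# An in-place change on a frozen object raises an error
# (FrozenError on Ruby 2.5+, which is a subclass of RuntimeError).
def try_to_mutate(settings)
  settings[:theme] = "dark"
  :mutated
rescue RuntimeError
  :rejected
end

dark = with_theme(DEFAULTS, "dark")
puts dark[:theme]            # dark
puts DEFAULTS[:theme]        # light
puts try_to_mutate(DEFAULTS) # rejected
```

Note that freeze is shallow: it protects the Hash itself, not the objects stored inside it, so nested values need freezing separately if they must not change.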

Immutability can help with performance too. For example, using Ruby’s Object#freeze in certain situations, such as on String Hash keys, can speed things up, as demonstrated by @schneems here.
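One reason String keys benefit, as far as I understand it, is that Hash#[]= stores its own frozen copy of an unfrozen String key, whereas an already frozen key can be stored as-is. The sketch below only demonstrates that copying behaviour, not the timings; stored_key_for, MUTABLE_KEY and FROZEN_KEY are names made up for the example.

```ruby
# Store a key in a fresh Hash and hand back the key object the Hash kept.
def stored_key_for(key)
  table = {}
  table[key] = true
  table.keys.first
end

MUTABLE_KEY = String.new("user_id") # explicitly unfrozen String
FROZEN_KEY  = "user_id".freeze      # frozen String literal

copy = stored_key_for(MUTABLE_KEY)
puts copy.frozen?            # true
puts copy.equal?(MUTABLE_KEY) # false, the Hash made its own copy

kept = stored_key_for(FROZEN_KEY)
puts kept.equal?(FROZEN_KEY) # true, no copy was needed
```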

Some more useful resources to dig deeper into immutability:

--

Ahmad Hammad
Buttercloud Labs

Anglo-Palestinian currently in London. Partner & Software Engineer at ButterCloud