Key Lesson: Building CloudRepo With Clojure
One of the exciting things about starting your own company is that you get to decide which technology stacks to use.
Win or lose, you’re responsible for the outcome, so better not make a mistake, right?
The safe technology stacks to use are the ones that ‘everyone else’ is using.
To me, one of the worst reasons to do anything is because ‘everyone else’ is doing it. That seems like an excuse to turn off your mind and just follow the herd.
If you follow the herd, you’ll end up where the herd ends up. If that’s where you want to be, great, more power to you.
It’s not where I will be, nor is it where I plan to take my company.
The herd today uses Java.
We built CloudRepo with Clojure.
We learned some things and we’d like to share them.
The Joy of Clojure
CloudRepo was the first production system that we built, from the ground up, in Clojure. We had been working with Clojure, on and off, for over five years but never had the opportunity to create something with it.
Why did we choose it?
There has always been a faint mysticism around alternative programming languages and how they can help you solve problems in a ‘better’ way. Better is a subjective term, and we wanted to experience for ourselves what ‘better’ meant for us.
The only way to determine what ‘better’ meant for us was to build something with Clojure; in this case, that something was CloudRepo.
As an aside, we’re huge fans of Rich Hickey: he has single-handedly been the greatest influence on my growth as a software engineer. His talks alone have made me a better engineer, not just in Clojure, but in every other language that I’ve used since.
Clojure itself has been the second greatest influence on my growth as an engineer. It broke me out of the OO mindset and really broadened my knowledge of programming languages in general.
If part of success is learning (and for us it is) then what could we learn by using Clojure full time?
We decided to find out.
Paradigm Shift
At first, Clojure can feel awkward, as if you’re fighting it to do something simple, especially if you come from a solid Object Oriented (OO) background using programming languages like C++, C#, or Java.
This happens because there’s a complete paradigm shift required to move from OO to Clojure.
“Data is the API” is the paradigm that rules in Clojure.
Huh? What does that mean?
It is a phrase that I heard from a member of the Clojure community, and I have discovered that it’s the key to leveraging the power of Clojure.
I’ll try to explain briefly now and will provide an example later on in this post:
In Java, our class definitions define the APIs — we spend a lot of time figuring out how our classes integrate with each other. Even data is encapsulated in classes which have their own APIs.
We have spent countless mind-numbing hours in our careers wiring up classes to each other, writing code like a.setFoo(b.getFoo()); when we had data of similar form but of different static types.
In Clojure, if you adopt the ‘Data is the API’ paradigm, then designing systems is primarily about identifying your data and how it flows through your functions. Data can be represented by a small set of abstractions, in particular the map, which reduces the need to write the glue code that is so commonplace in Java (a.setFoo(b.getFoo()), etc.).
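As a minimal sketch of what this looks like in practice, consider two hypothetical maps, `user` and `account` (the names and keys are illustrative and do not come from CloudRepo). Because both values are plain maps, core functions such as `select-keys` and `merge` can move data between them without any per-type getters or setters.

```clojure
;; Two hypothetical pieces of data with different "shapes".
(def user    {:name "Ada" :email "ada@example.com" :role :admin})
(def account {:id 42 :plan :free})

;; Copy the fields we care about from one map into the other.
;; No a.setFoo(b.getFoo()) style glue is needed, since both are maps.
(def account-with-owner
  (merge account (select-keys user [:name :email])))

;; account-with-owner
;; => {:id 42, :plan :free, :name "Ada", :email "ada@example.com"}
```

The same two core functions work on any map, whatever it happens to represent.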
Our Experience
As we were building CloudRepo we learned that we were still thinking like OO programmers. Clojure was fighting us along the way and eventually we received the message. As we started to adopt the paradigm shift Clojure became easier to work with.
Today, working with Clojure is a joy: we have learned to think of the system in terms of the data flowing through it and how that data is transformed and manipulated.
This is what I believe is meant by ‘Data is the API’.
Programming is no longer a sequence of ‘call this function’, ‘convert to this type’, ‘call this function’, ‘convert’, etc., as you would see in a statically typed language.
We started out using Clojure in exactly that manner and that’s when we felt it was fighting us.
Because it was fighting us.
Because you should not program in Clojure like it is an Object Oriented language.
If you take one thing from this post, I hope it’s that.
Idiomatic Clojure
Idiomatic programming in Clojure is like creating a pipeline of functions:
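A small sketch of such a pipeline, assuming a hypothetical order-pricing task (the function names, keys, and the flat 10% tax rate are all made up for illustration): every step takes a map and returns a slightly richer map, and the thread-first macro `->` chains the steps together.

```clojure
(require '[clojure.string :as str])

;; Each hypothetical step takes a map and returns a (possibly enriched) map.
(defn normalize-name [order]
  (update order :customer str/trim))

(defn add-subtotal [order]
  (assoc order :subtotal (* (:quantity order) (:unit-price order))))

(defn add-tax [order]
  ;; Assumed flat 10% tax rate, for illustration only.
  (assoc order :tax (/ (:subtotal order) 10)))

(defn add-total [order]
  (assoc order :total (+ (:subtotal order) (:tax order))))

;; The thread-first macro passes the result of each step to the next one.
(defn price-order [order]
  (-> order
      normalize-name
      add-subtotal
      add-tax
      add-total))

(price-order {:customer "  Ada " :quantity 3 :unit-price 20})
;; => {:customer "Ada", :quantity 3, :unit-price 20, :subtotal 60, :tax 6, :total 66}
```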
This is where you can really see the difference between static typing and dynamic typing.
In OO, when your functions all take a different type (i.e. data objects or type classes), you have to write a lot of glue code to manage mappings and conversions between fields.
In Java, you can use lambdas to simplify things a bit but your mapping functions still need to know how Type A maps to Type B.
In Clojure, where most of the data passed around is abstracted by maps, glue code to manage conversions between types is no longer necessary.
If you pass around maps in Clojure and embrace this concept, Clojure will stop fighting you and will open its arms to you like a best friend.
This was a very important lesson, and it took us longer than it should have to learn. All those years of OO habits took quite some time to overcome.
Now that we’ve learned our lesson and applied it to our code, programming with Clojure feels like a superpower.
Key Concepts — Data as the API
There are a ton of posts like this about Clojure. People seem to figure it out, then post on a blog about how great it is. Our frustration with these posts is that they never really taught us how to think about data flowing through a system.
With almost 20 years of OO experience, it was difficult to change the way our minds approached software problems.
Passing Maps as Parameters
For us, one of the hardest habits to break (because it was a habit we weren’t aware of) was defining functions in a way where all the parameters were declared separately. If you needed three pieces of information, you’d have three parameters. In a static typing world this ensures that people are calling your method properly.
Additionally, if you were working with objects with similar fields and ‘shapes’ you’d still have to define conversions between all the types in use.
This example shows just how much boilerplate is required in Java just to get things connected properly. There is very little program logic present, i.e. ‘a lot of ceremony’.
In Clojure, the idea is that most functions work on a small set of abstractions, one of them being the map (associative) abstraction. If you write functions that accept and return the map abstraction then you can declare the previous Java code in Clojure like so:
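The sketch below is an illustrative stand-in rather than CloudRepo’s actual code; the `person` map, its keys, and all function names are assumptions. It contrasts a function with three separate positional parameters against functions that accept a single map, destructure the keys they need, and return the map with new keys added.

```clojure
;; Positional style: the caller must know the order of three separate arguments.
(defn greeting-positional [first-name last-name title]
  (str "Hello, " title " " first-name " " last-name))

;; Map style: one argument, destructured by key, returned with a new key added.
(defn add-full-name [{:keys [first-name last-name] :as person}]
  (assoc person :full-name (str first-name " " last-name)))

(defn add-greeting [{:keys [title full-name] :as person}]
  (assoc person :greeting (str "Hello, " title " " full-name)))

;; Because each function accepts and returns a map, the calls thread together.
(-> {:first-name "Grace" :last-name "Hopper" :title "Dr."}
    add-full-name
    add-greeting
    :greeting)
;; => "Hello, Dr. Grace Hopper"
```

Adding a fourth piece of information later means adding a key to the map, not changing every function signature along the way.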
This is much more concise and simple, which is what we all strive for our programs to be.
In the example, you can see how Clojure function calls can be threaded together in order to form a pipeline, especially once functions are implemented to accept and return a map.
Data Flow Based Designing
When you design OO based programs, you end up with a lot of sequence diagrams showing the interactions between objects (Google UML Sequence Diagrams for an example).
This is probably why you don’t see a lot of sequence diagrams these days — doing the boilerplate in code is one thing but making boilerplate in diagrams is just dreadful.
Because of the static type system, objects become coupled to each other: they depend on specific types to interact with, and if you attempt to substitute a different type, the compiler will prevent you from doing so, even if the shape of the new type is similar (i.e. both have an Integer getA() method but come from different classes).
In Clojure, functions call functions, and the only difference between functions is the arity. However, if you converge on passing around maps, the arity of your functions will become 1: simply the map. Coupling has been moved from the type signatures (object to object) to the shape of the data (i.e. passing maps with expected keys such as :a).
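To make the idea of coupling to shape concrete, here is a small sketch with a hypothetical function `double-a` and two made-up maps. The function depends only on the presence of an `:a` key, so it accepts unrelated kinds of data without any conversion.

```clojure
;; A hypothetical function that only cares that its argument has an :a key.
(defn double-a [{:keys [a] :as m}]
  (assoc m :a-doubled (* 2 a)))

;; Two unrelated "kinds" of data that both happen to carry :a.
(def measurement {:a 21 :unit :cm})
(def invoice     {:a 5 :customer "Ada" :paid? false})

(double-a measurement) ;; => {:a 21, :unit :cm, :a-doubled 42}
(double-a invoice)     ;; => {:a 5, :customer "Ada", :paid? false, :a-doubled 10}

;; The coupling is to the shape of the data: a map without :a is only
;; rejected at run time (multiplying by nil throws), not by a compiler.
```

The trade-off is visible in the last comment: the flexibility comes from giving up the compile-time check that a static type would provide.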
In Java, even data classes provide their own set of APIs — there is no common abstraction that is easy to use to represent data in the same way that Clojure does. You have to write custom code to work with each individual data type.
Java does have Maps and you can pass them around like you would in Clojure, but the keys are typed, and even if you use Strings as the keys you’re in for a lot of extra work, because data classes abstracted by APIs are idiomatic to Java; Maps are not.
Clojure is functional and because ‘Data is the API’ our modeling is very different. Instead of modeling interactions between objects in the system, we model data and how data passes through the system and evolves while passing through each function.
We haven’t seen any examples of how to diagram this, so we’ve decided to use Data Flow diagrams to help design our programs.
Data as API — Design Example
The following example illustrates the flow of data in CloudRepo when a request to retrieve an artifact arrives.
When a request arrives at CloudRepo to retrieve an artifact, we receive two bits of data, :token and :uri, in our input map, and we transform that map until we get to the final result and return the response to the client.
The rectangles are functions and the parallelograms are the data being passed between them. Note how the data ‘flows’ through the diagram.
Data flow diagrams have been around for a while. What is new, for us anyway, and the point of this post, is how elegantly this diagram can be converted into Clojure code:
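The sketch below shows the general shape such code can take. Only the `:token` and `:uri` keys of the input map come from the description above; the step names, the in-memory `tokens` and `artifacts` stores, and the response format are assumptions made for illustration and are not CloudRepo’s implementation.

```clojure
;; In-memory stand-ins for a token store and artifact storage (assumptions).
(def tokens    {"secret-token" "ada"})
(def artifacts {"/repo/app-1.0.jar" "jar-bytes"})

;; Each step takes the request map and returns it with more data attached.
(defn authenticate [{:keys [token] :as request}]
  (assoc request :user (get tokens token)))

(defn locate-artifact [{:keys [uri] :as request}]
  (assoc request :artifact (get artifacts uri)))

;; The final step turns the accumulated map into a response map.
(defn build-response [{:keys [user artifact]}]
  (cond
    (nil? user)     {:status 401 :body "unauthorized"}
    (nil? artifact) {:status 404 :body "not found"}
    :else           {:status 200 :body artifact}))

;; One function per rectangle in a data flow diagram, threaded in order.
(defn retrieve-artifact [request]
  (-> request
      authenticate
      locate-artifact
      build-response))

(retrieve-artifact {:token "secret-token" :uri "/repo/app-1.0.jar"})
;; => {:status 200, :body "jar-bytes"}
```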
Since the functions all take and return the same type (maps) we can chain them together without any of the glue code that is required in statically typed languages.
This is powerful stuff. It really allows you to spend your time fleshing out a design and, when it’s done, translate it into simple code. The code is simpler and thus easier to read, write, debug, etc.
Once we started designing and implementing our code in this manner, a light bulb went off in our heads. This stuff can be incredibly powerful and can save us a tremendous amount of time.
Embrace the Power
Clojure is powerful and can simplify your software development tremendously, but you have to take the time to understand how you’re supposed to use it.
If you approach Clojure with the mindset of an OO programmer, it will fight you, you will not like it, and you will go back to countless hours of mindless boilerplate coding.
Stick with Clojure, converge on maps as the primary abstraction for your data, and it will all click one day.
When that day comes, you and your team will gain a superpower that will put you miles ahead of your competition, who will never try Clojure until ‘everyone else is doing it’.
When that day comes, you’ll be light years ahead of them.