From Ruby to Clojure: An Intercom integration story
Re-building integrations with a new stack
Coding without context
As an engineer, I’ve always struggled with the same problem: lack of context. Bug reports would come through from support, product people would push feature requests, and neither would have much context.
Sometimes, I’d see a quote or NPS response from a user, and all I’d get was a summary of what another team thought was important. All of that is valuable, no doubt, but a sentiment often shared by developers is “I don’t know what our customers are saying”. This is what NomNom is trying to change.
NomNom integrates multiple sources of feedback and user research into one place — sources like Intercom, NPS services, mobile app reviews, emails etc., so you can dig out all the insights, straight from the horse’s mouth. We are Intercom users ourselves and love the fact that we can handle support straight from our phones, on the go.
At NomNom, everyone in our team has access to feedback received via Intercom (our primary channel), emails, surveys and user research interviews. Whenever possible, we link to customer conversations which have bug reports, feature requests or general feedback. This also helps our engineers with customer support — everything is in the open, so everybody can contribute.
Building the integration and the need for a new stack
Intercom was one of the first data sources we added to NomNom; my co-founder and I were familiar with the service already so it was a logical choice. At the very beginning of our journey, we went with a familiar stack consisting of Ruby, Rails, Postgres and RabbitMQ.
Everybody will tell you that you should throw away your prototype. At the same time, everybody will tell you that many prototypes end up as production software running for years.
NomNom is not that old, so it’s hardly legacy software, but as we started to grow, onboarding more customers and adding more integrations, we ran into several issues and realisations:
- Ruby is great at getting things done fast, but not efficient enough for the long term.
- The data model we came up with initially was OK, but not the data storage (Postgres+JSONB). We realized we were not using the right tool for the job (turns out that frequently creating and updating millions of JSON objects can put a lot of strain on the database) — especially when we were still figuring out our schema and storage needs. After a year we’re now much better informed and might experiment with Postgres again.
- We wanted to squeeze more performance and handle more intensive data processing.
- More customers started using the integration and we got a lot more useful feedback on how to improve it.
In addition to all of the above, we started working on features based on machine learning and natural language processing, and Ruby is lacking in these areas. We evaluated Python and, while it was fast to get off the ground, the deployment and dependency-management story wasn’t good enough for us. All of that led us to the JVM (Java Virtual Machine).
Why we’re not scared of (these (weird :parens))
The JVM is not only the runtime for Java; it’s also stable, battle-tested and has a rich ecosystem of high-quality libraries. There was one problem though: we didn’t want to write Java.
I had some experience in Clojure and Scheme (fun fact: we run a pretty critical service written in Chicken Scheme in production) and I’ve always been a Lisp fanboy. Completely independently, one of our engineers, Afonso, suggested: “why not use Clojure?”. So, in the spirit of “right tool for the job, I hope”, we went with it.
We started small and implemented a new service that was a part of our data processing pipeline. After an initially rough start (fixed by JVM tuning) we were really impressed with how well it worked. We started quickly adding more components written in Clojure, including natural language processing.
As an aside, this is where many functional language proponents get it wrong: yes, some of us like Lisp’s syntax-less syntax, but in the beginning we didn’t care much about pure functions, category theory or “monads are like burritos”. What we liked was that the code is easy to reason about (fewer concepts: no classes, objects or instances, just functions, values and namespaces) and performs well.
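To make that concrete, here’s a minimal, hypothetical sketch in the style of our feedback-processing code (the namespace and data are illustrative, not our actual schema): one namespace, plain maps for data, and pure functions over them — no classes in sight.

```clojure
(ns nomnom.feedback
  (:require [clojure.string :as str]))

;; A "document" is just a plain map -- no class definition needed.
(def example-doc
  {:source :intercom
   :body   "Love the product, but search is slow"
   :tags   #{}})

(defn tag-performance
  "Pure function: returns a new document, never mutates the input."
  [doc]
  (if (str/includes? (str/lower-case (:body doc)) "slow")
    (update doc :tags conj :performance)
    doc))

(tag-performance example-doc)
;; => {:source :intercom, :body "Love the product, but search is slow",
;;     :tags #{:performance}}
```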
To cut a long story short: we came for the JVM and we stayed because of everything else.
Making the transition
As parts of our system moved to Clojure, the biggest change we made was to separate business logic (accounts, users) from our document management. Ruby is not all bad, and we made some good design decisions in the beginning, so the transition wasn’t as painful as we thought it would be.
- Our Ruby code was fairly modular; after two years the top-level structure is still the same, but the underlying components moved from being clients of various data stores (Postgres, Redis, Elasticsearch) to clients of various services we run.
- Wherever it made sense, we chose to process data asynchronously — we don’t have real-time (or near-real-time) requirements in most parts of our system.
- RabbitMQ is an amazing tool — with client libs available in every language we could start replacing our worker code without affecting the rest of the system.
The hardest part to move was the system that manages document ingestion, search and rule engine processing. It was effectively a rewrite, and it reduced our Ruby codebase by 50%.
Once the biggest part of the migration was behind us, we saw a big speed boost and improved resource utilisation, despite the JVM’s slow startup. We were able to reduce machine sizes and run fewer of them: native threads, a superior VM and an optimized language runtime fixed all the memory leaks and performance issues we had experienced before.
The last missing part was the third-party integrations code — rather than introducing a new integration, our lead backend engineer Marketa kicked off the migration by porting and improving one of the less-used sources.
Following that, we quickly went from one Clojure-powered integration to seven and reduced the footprint of our (at this point) legacy Ruby codebase. The latest integration to be migrated was Intercom.
Rewriting our Intercom integration
We work with many APIs — and believe me, I have a lengthy draft about “API worst practices” based on my experience with some companies’ APIs. Luckily, Intercom’s API is a pleasure to work with. All the data we need is present and their platform team is always listening to our feedback.
Last year Intercom started deprecating API Keys in favour of OAuth — a great move for us, because as well as being more secure, it simplified connecting services in NomNom. In practice it was an easy switch, as we could transparently handle both types of authentication and let our users reconnect their integration at any point.
Another way Intercom’s platform really shines is that it has one of the most comprehensive webhook implementations out there; the list of supported events covers all use cases, the delivery is rock solid and it gives us enough data to defer heavier data fetches and perform them only when necessary.
Our new Intercom integration leverages the fact that our front-end Rails application already receives each webhook, verifies that it is valid and pushes it to a RabbitMQ queue. This made the switch really simple: we turned off the old code and the new Clojure consumer started picking up the jobs, without any other services needing to know anything had changed.
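A consumer along those lines can be sketched with langohr, the Clojure RabbitMQ client (the queue name, connection URI and `process-event` function are illustrative assumptions, not our actual code):

```clojure
(ns nomnom.intercom.consumer
  (:require [langohr.core      :as rmq]
            [langohr.channel   :as lch]
            [langohr.queue     :as lq]
            [langohr.consumers :as lc]
            [langohr.basic     :as lb]
            [cheshire.core     :as json]))

(defn process-event
  "Placeholder: decide whether the notification warrants a heavier
  Intercom API fetch, and perform it only when necessary."
  [event]
  (println "ingesting" (get-in event [:data :item :type])))

(defn handle-delivery
  [ch {:keys [delivery-tag]} ^bytes payload]
  (let [event (json/parse-string (String. payload "UTF-8") true)]
    (process-event event)
    (lb/ack ch delivery-tag)))

(defn start-consumer []
  (let [conn (rmq/connect {:uri (System/getenv "RABBITMQ_URL")})
        ch   (lch/open conn)]
    ;; The Rails app has already verified the webhook signature
    ;; before publishing to this queue.
    (lq/declare ch "intercom.webhooks" {:durable true})
    (lc/subscribe ch "intercom.webhooks" handle-delivery {:auto-ack false})))
```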
Component is awesome
When reading about large Lisp systems, one of the disadvantages that often gets mentioned is the lack of structure and the difficulty of managing shared mutable state.
Clojure discourages the latter but provides little guidance in terms of code organization. To address that, we’ve used Stuart Sierra’s Component and built all our integrations around it. This gives us a pattern for structuring our codebase (which is not small at this point), makes it easier to work in the REPL and encourages dependency injection (but in a functional way — no XML here, folks!). This was a great learning experience for the team. Marketa will be speaking about Component and its adoption at Dutch Clojure Days 2017.
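As an illustration, a Component-based system wires an integration to its dependencies roughly like this (the record, constructors and keys here are hypothetical stand-ins, not our real system):

```clojure
(ns nomnom.system
  (:require [com.stuartsierra.component :as component]))

;; Hypothetical constructors -- stand-ins for real start/stop logic.
(defn connect-rabbitmq [] {:connection :conn})
(defn start-consumer [conn] {:consumer :running})
(defn stop-consumer [c] nil)

(defrecord IntercomIntegration [rabbitmq consumer]
  component/Lifecycle
  (start [this]
    (assoc this :consumer (start-consumer (:connection rabbitmq))))
  (stop [this]
    (when consumer (stop-consumer consumer))
    (assoc this :consumer nil)))

(defn system []
  (component/system-map
   :rabbitmq (connect-rabbitmq)
   ;; Dependency injection, functionally: Component assoc'es the started
   ;; :rabbitmq value into :intercom before starting it.
   :intercom (component/using (map->IntercomIntegration {})
                              [:rabbitmq])))

;; (def running (component/start (system)))
;; (component/stop running)
```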
Metrics everywhere are awesome too
Over time we’ve assembled a set of shared libraries which let us quickly set up RabbitMQ consumers, HTTP request handlers and so on. All of that code comes with built-in instrumentation and sends metrics about the number of successes, retries and failures, along with timing information.
This way we can see if we’re accepting webhooks in a timely manner, how many Intercom conversations we ingest per minute, and so on.
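That instrumentation boils down to wrapping handlers. A simplified sketch, with `report!` standing in for whatever metrics client is actually used (StatsD, Prometheus, …):

```clojure
(ns nomnom.metrics)

(defn report!
  "Stand-in for a real metrics client."
  [metric value]
  (println metric value))

(defn instrumented
  "Wrap a handler so every call reports success/failure counts
  and timing information under `metric-name`."
  [metric-name handler]
  (fn [msg]
    (let [start (System/nanoTime)]
      (try
        (let [result (handler msg)]
          (report! (str metric-name ".success") 1)
          result)
        (catch Exception e
          (report! (str metric-name ".failure") 1)
          (throw e))
        (finally
          (report! (str metric-name ".time-ms")
                   (/ (- (System/nanoTime) start) 1e6)))))))
```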
Since our public launch last November, we’ve ingested over half a million conversations from Intercom alone. The number keeps growing every day and we recently started ingesting additional data from Intercom about users and their properties.
We’re acquiring new customers and becoming more reliant on Intercom itself for our own support, meaning that we are not only dogfooding NomNom but also our Intercom integration. The best is yet to come 😀
If you’re an engineer, share NomNom with your product team, they will love you for it.