Dependency Disease: A lack of control, especially over decisions about third-party software, produces businesses that look healthy from the outside while “dependency disease” rots their decision-making from the inside out.
It’s 9:15 am and I’m alighting from the train at London Waterloo station. I’ve missed the morning standup. My train was delayed… again. More often than not, the source of the problem lies beyond the operator’s control: a dependency without which their system simply cannot operate, bringing about a degraded service and some very disgruntled customers.
When we think of the software we build as a service, we should aim to avoid outages and keep our customers satisfied. It is important to realise that many of your customers don’t care about the internals of your system. They only care about their side of the bargain and rightfully so.
Don’t worry, this article isn’t about delayed or cancelled trains. If you’re interested in Bitcoin and production systems then keep reading.
Reinventing the wheel
I remember one of my first programming lectures, where we were advised not to reinvent the wheel. A module for rendering rectangles onto the screen has already been written and extensively tested. Do not waste your time. Focus on the problem YOU are trying to solve. Less work? Great.
While this is true for many tasks within engineering, it depends on the domain you’re operating in. Some of the problems you encounter may be entirely new, warranting the creation of a new solution to fill the gap.
An issue arises when you don’t realise this early enough. You end up choosing a substandard service that only loosely solves your problem, and you’re now dependent upon a module or service that isn’t really the right one.
You might ask, where is your due diligence in all of this? Making sure that you inspect a service and its capabilities before you integrate, reducing the likelihood of a major refactor or regression down the line. In the real world, we often need to time-box these tasks. The service may have fit at the time, but is it still the most favourable solution when requirements change and the codebase evolves?
Bitcoin in Production
To give a more concrete example, we recently released our Bitcoin capability with Trustology.
Naturally, we wanted to give our customers a seamless experience while providing them with all the juicy Bitcoin features you’d expect from a high-quality cryptocurrency application. Like many teams, we defined a roadmap containing the Bitcoin features that we’d like to offer in the short-to-medium term. We did some due diligence and technical research on a third-party provider and voilà, we had Bitcoin! On to the next feature…
Well, in a perfect world this might have been the case.
Let’s rewind. Remember how we said that as a codebase evolves, the earlier solution may become unsuited to the problem space? As new requirements demand more of the original design, we begin to see how a brittle dependency can start to fall apart. Our next Bitcoin feature highlighted our architectural shortcomings quite well…
Avoiding address reuse:
Address reuse harms the privacy of participants on the Bitcoin network. There are many reasons to avoid address reuse. With this in mind, we knew it was important to protect our customers.
Our existing third-party provider indexed all addresses on the Bitcoin blockchain. Now that we had a one-to-many relationship between customers and addresses, the number of API calls we had to make multiplied with every new address. This is because their API was designed to be accessible only at an address level, which isn’t useful when your customers can each have any number of addresses.
Our reliance on a third party to give us important chain data about addresses (UTXOs for coin selection and balance calculation) was becoming an impediment, not to mention expensive and slow.
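To make the per-address problem concrete, here is a minimal sketch of what a balance query looks like against an address-level API. The function and data shapes are hypothetical, not the real provider’s interface; the point is that every address costs one request.

```python
# Hypothetical sketch: a per-address API forces one request per address,
# so a customer's balance query costs N calls instead of one.
# `get_address_utxos` stands in for a single HTTP call to the provider.

def fetch_balance(customer_addresses, get_address_utxos):
    """Sum UTXO values across all of a customer's addresses.

    Makes len(customer_addresses) calls to the underlying API.
    """
    total_sats = 0
    for address in customer_addresses:
        for utxo in get_address_utxos(address):  # one API request each
            total_sats += utxo["value"]
    return total_sats
```

With 20 addresses per customer, a single balance refresh is already 20 requests before any caching.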
Within a few hours of releasing our new privacy feature, we encountered drastic API throttling, leaving us with some very difficult decisions.
A top-tier API plan from our third-party cost approximately £1,800/month for 15,000 requests/hour. Let’s crunch the numbers:
Imagine you have 10,000 customers and a modest 5% of these customers are trading Bitcoin within a 60-minute window:
- Each customer has 20 addresses
- Each customer retrieves their balance just twice in the 60 minutes (40 API calls per customer, assuming no caching: 500 × 20 × 2 = 20,000 requests)
We’ve already hit 20,000 requests using a very conservative estimate, and the users haven’t even signed any Bitcoin transactions yet. Cue the disgruntled customers!
It’s worth mentioning that you can import hierarchical deterministic wallets into this particular service. Doing so would allow you to significantly reduce the number of API requests you make for a typical customer session. However, choosing to integrate deeper felt like the wrong path to follow. As discussed in an earlier blog post, we know the potential issues that can arise from leaking extended public keys. Besides, giving a third party an extended public key that can generate every address in your wallet doesn’t bode well for our new privacy feature! 😬
Any company that plans to be on the cutting edge of its domain should think twice about where its core business data resides. Choosing a third party means that you’re inherently dependent on the responsiveness of others to new market trends. Is this the right precedent to set? 🤔
We built our bespoke Bitcoin indexer. Now we only pay for what we use, can scale indefinitely and have virtually zero outages (insert Bitcoin scalability joke here).
Running your own Bitcoin node isn’t always feasible, regardless of what many Bitcoin enthusiasts tell you. For us, it made sense both financially and operationally.
In the last 5 years, we’ve seen many startups become quite successful by offering chain-indexing services. Most started as simple block explorers before evolving into complex APIs. At their foundation, most employ what I like to call ‘Just in Case’ indexing: their backend systems index data for every address on the blockchain, even if that address isn’t owned by a paying customer. Obviously, this is an essential requirement of operating a block explorer, but it can cause issues down the line when you have to provide real-time, reliable data to your paying API consumers.
Issues will undoubtedly arise as a byproduct of this increased complexity. In our experience, these providers repeatedly faced system-wide outages due to database indexing issues. It was not uncommon for these outages to span UK office hours due to the time zone differences between the UK and USA.
Just to underline the complexity involved, ₿itcoin Core has already seen 4 failed pull request attempts to natively support ‘Just in Case’ indexing. There are also other solutions out there, like BitPay’s Insight API, but again this wasn’t quite right for us.
Trustology’s Bitcoin indexer operates on a model that I call ‘Just in Time’ indexing 🕘. Whenever a user generates a new receive or change address, we add it to our set of addresses to track. From this information, we can focus only on the addresses we currently care about, providing a plethora of insightful information to the user. Key engineering decisions such as building our own indexing service have allowed us to capture valuable learning experiences from these nascent technologies, enabling our teams to build a wealth of knowledge in-house.
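The core of the ‘Just in Time’ idea can be sketched in a few dozen lines. This is a deliberately simplified illustration rather than Trustology’s actual implementation: data shapes are invented, and a real indexer must also handle reorgs, the mempool, and confirmations.

```python
# Minimal sketch of 'Just in Time' indexing: track only the addresses our
# customers actually generate, and scan each new block against that set.

class JustInTimeIndexer:
    def __init__(self):
        self.watched = set()   # addresses we care about
        self.utxos = {}        # (txid, vout) -> {"address": ..., "value": ...}

    def watch(self, address):
        """Called whenever a customer generates a receive or change address."""
        self.watched.add(address)

    def process_block(self, block):
        for tx in block["txs"]:
            # Spent outputs: remove any of our UTXOs consumed as inputs.
            for txid, vout in tx["inputs"]:
                self.utxos.pop((txid, vout), None)
            # New outputs: index only those paying a watched address.
            for vout, output in enumerate(tx["outputs"]):
                if output["address"] in self.watched:
                    self.utxos[(tx["txid"], vout)] = output

    def balance(self, addresses):
        """Sum the values of unspent outputs belonging to these addresses."""
        wanted = set(addresses)
        return sum(u["value"] for u in self.utxos.values()
                   if u["address"] in wanted)
```

Because unwatched addresses never enter the UTXO set, the storage and query cost scales with your customer base rather than with the whole blockchain, which is the crux of the difference from ‘Just in Case’ indexing.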
Over the last 20 months, I’ve seen our team grow from 4 to 20+ people. Within that time I’ve worked with an extremely talented group of people. It’s exciting to think about what the coming months have in store for Trustology.
Thanks for reading, make sure to follow me if you’d like to read more about ‘Bitcoin in Production’ or posts similar to this one 🔥