BitSocket 2.0

Build realtime bitcoin apps.

_unwriter
Jan 14, 2020

We are releasing BitSocket 2.0, a production-grade global bitcoin push notification API service, completely redesigned from scratch for unbounded scalability. I originally launched the first version of Bitsocket, 1.0, in October 2018.

Because Bitsocket provided a powerful programmable realtime filter of the Bitcoin peer network, it gained a lot of attention and usage, and has even been forked by developers from several other blockchains for their own use.

But the problem is, the original model of Bitsocket (Bitsocket 1.0) doesn’t scale in the world of gigabyte+ blocks.

BitSocket 2.0 was rebuilt from scratch to fix many of the fundamentally critical problems with Bitsocket 1.0, including user experience, reliability, scalability, consistency, and more.

For those of you alt-coin developers who have been using Bitsocket forks (or similar approaches) to subscribe to various patterns of global realtime events for your blockchain networks, you will probably need to make a choice:

  1. Make sure your blockchain never scales to gigabyte+ blocks, so you don’t run into scaling issues.
  2. Give up on realtime global query filters on whichever blockchain you’re working on.
  3. Come join Bitcoin, and use the new Bitsocket 2.0, whose new paradigm of event processing can scale with no limits.

Now, let me explain what this new Bitsocket 2.0 is all about.

Table of Contents

  1. What is Bitsocket?
  2. Problems with Bitsocket 1.0
  3. Bitsocket 2.0
  4. Why this matters
  5. Conclusion

What is Bitsocket?

BitSocket is a programmable realtime push notification service for Bitcoin transactions. Instead of having to run your own Bitcoin peer in order to filter only the transactions that matter to you among thousands (soon to be millions) of realtime transactions on the Bitcoin peer network, you can simply subscribe to a single query stream which notifies you with ONLY the relevant transactions you are interested in.

This way your machine doesn’t have to waste the network traffic and computational power required to run all the filtering operations, yet it still has access to all the global realtime transaction events on the Bitcoin network.

Bitsocket is already being used by various Bitcoin applications you are already familiar with, such as Bitcoinblocks.live, Bitstagram, Bitgraph, etc., and with Bitsocket 2.0, we will start to see a whole new range of applications which require more robust realtime event processing.

Bitcoinblocks.live

Bitcoinblocks.live has become one of the most popular BSV services lately. This app is also powered by Bitsocket.

BSV Matrix

Very simple yet cool demo to turn transactions into a Matrix-like animation.

Bitgraph

Bitgraph uses Bitsocket to render transaction graphs in realtime.

You can check it out here: bitgraph.network

Problems with Bitsocket 1.0

While the concept of Bitsocket is very cool, it has had a lot of problems which made it hard for applications to fully take advantage of the API. Let’s go through each.

1. Unscalable

There have been multiple iterations of Bitsocket, each improving upon its predecessor. But all of them had the same high level architecture, as shown below:

Basically, Bitsocket would monitor the Bitcoin network, and every time there was a new transaction, it would run the transaction through all its existing filters to determine whether the transaction passes each filter. If it does, it would send an SSE (Server-Sent Events) notification to each subscribed client.

The main problem with this approach is its scalability. As the diagram suggests, this event processing model has a computational complexity of O(m*n), where m is the number of incoming transactions and n is the number of filters (one per subscribed client). But this is not the full picture. The transaction volume m itself can and will grow exponentially, which will make the entire model extremely inefficient. If m grows like c^x over time x, the O(m*n) is basically O(c^x*n).

Theories aside, this has been a very real problem already, and it was foreshadowed by several past stress tests. Whenever the transaction volume increased significantly, Bitsocket event processors would fail to keep up with the incoming transactions because they had to process every event individually. Also, events would often be delivered much later than when they should have triggered, because the event processing tasks took far too long. This defeats the purpose of a “realtime” socket.

There has to be a better way.

2. Incomplete

Another critical problem with Bitsocket 1.0 has been that all events are ephemeral, and clients are not guaranteed to receive every event that has occurred. If you are not connected when an event happens, you miss it, because there is no persistence mechanism.

This is especially critical for use cases where a client device often goes offline. Whenever a device goes offline, it misses all the events that occurred while it was offline, as shown below:

Because the client cannot fully trust that it will get every event, it must still implement occasional polling logic to make sure it’s completely in sync. This, again, is inefficient. This aspect is very important for today’s edge computing and “Internet of Things” Bitcoin applications because they can’t assume highly reliable connections.

But this goes further than just devices with spotty connections. The fact that clients cannot fully trust the Bitsocket host to deliver 100% of the relevant events means Bitsocket is not fault tolerant, and businesses cannot trust Bitsocket enough to power their applications with it.

There needs to be a way to make sure all clients NEVER miss a relevant event in any situation.

3. Inconsistent

One of the benefits of Bitsocket is that you can use the same Bitquery used by Planaria nodes to filter events.

But here’s the thing: While this is theoretically a very powerful feature, it didn’t exactly work this way in practice. Sometimes Bitquery requests and bitsocket subscriptions delivered different results.

This is because BitSocket 1.0 has not been using a database to filter events. It has been using a library called Mingo which emulates MongoDB queries for in-memory objects instead of actually using a MongoDB.

This approach works in most simple cases, but the more complex a query becomes the more inconsistencies we could spot because it’s just an emulation. For example, for BOB serialization format, several people have reported Bitsocket filtered events not matching the corresponding Planaria query results with the same bitquery.

4. Unreliable

If you are a Planaria API user and recognize the following screen, I would like to thank you for bearing with me through all those months:

Until now, Planaria APIs have never been fully reliable, and I have never promoted them as such. This was because Planaria has always been a one-man show without enough resources. Each Planaria node I’ve been running has run on a single server with no automated maintenance.

Using Bitsocket as an example, the socket would occasionally get stuck for a while when there were too many incoming transaction events, because the job queue could not keep up with the incoming event speed. Also, the server would sometimes crash, and I had to personally keep an eye on it and restart the server whenever it happened. Thanks to the people on Atlantis slack I would find out about these issues before it was too late, but the infrastructure still needed to be much more reliable.

Bitsocket 2.0

Welcome to Bitsocket 2.0. It solves everything.

1. Scalable Architecture

To address the scalability challenge discussed above, Bitsocket 2.0 was rewritten from scratch, with a completely different paradigm of event processing. Let’s compare with Bitsocket 1.0 to see what has changed:

Bitsocket 1.0

In Bitsocket 1.0, every event was processed individually for every filter, and this resulted in an unmanageable explosion of in-memory event processing tasks with a complexity of O(m*n). And as m grows exponentially, the event processor would fail to keep up with the speed of incoming transactions, leading to huge delays in event delivery. Sometimes, the job queue would grow larger than the Planaria database itself. Just to review, here’s what it looked like (every event processing task takes place in memory):

Bitsocket 2.0

Bitsocket 2.0 removes this problem by unbundling incoming events from outgoing push notifications. Instead of Bitsocket directly interfacing with the Bitcoin network, it interfaces with Planaria which functions as a realtime event database:

For example, if there were 1,000,000 new transactions per second, this would have been enough to kill (or slow down) the old Bitsocket 1.0 engine with 1000 clients, because that would mean 1,000,000 * 1,000 = 1,000,000,000 event processing tasks must be processed PER SECOND.

How about Bitsocket 2.0? Because indexing and event processing are unbundled, each module (Planaria and Bitsocket) only needs to focus on its own tasks. Using the same example, Bitsocket now only needs to process 1,000 tasks per second. Bitsocket scales linearly with the number of client sessions: O(n).
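
The arithmetic in this example can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: the function names are made up for illustration, and the 2.0 figure assumes one batched task per client session per second, as in the example.

```python
def tasks_per_second_v1(tx_per_second: int, clients: int) -> int:
    """Bitsocket 1.0 model: every transaction is run through every client's filter."""
    return tx_per_second * clients


def tasks_per_second_v2(clients: int) -> int:
    """Bitsocket 2.0 model (assumed): one batched task per client session per second."""
    return clients


m = 1_000_000  # new transactions per second, from the example
n = 1_000      # subscribed clients, from the example

print(tasks_per_second_v1(m, n))  # 1000000000
print(tasks_per_second_v2(n))     # 1000
```

Note that the second number does not depend on m at all, which is the whole point of unbundling.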

Also, this goes both ways. The new unbundled architecture not only makes the Bitsocket side efficient, but also takes a huge load off of the Planaria side because now Planaria only needs to focus on writing O(m). With Bitsocket 1.0, the occasional stress tests would not only slow down the socket notifications but also slow down the entire planaria database response time for ordinary queries. This is no longer the case since these two modules are now separate. Best of both worlds!

2. Never Miss an Event

Bitsocket 2.0 makes use of a powerful built-in feature of SSE (Server Sent Events) called Last-Event-ID which lets each client remember the last “checkpoint” of the stream and automatically continue where it left off when it comes back from offline.

This, along with the new persistent event database architecture of Bitsocket, means events are no longer ephemeral, and therefore you will NEVER miss an event!

Here’s how it works:

In the following example, a client makes an SSE subscription request to the /s/eyJ2IjozLCJxIjp7ImZpbmQiOnt9LCJwcm9qZWN0Ijp7InR4LmgiOjEsInRpbWVzdGFtcCI6MSwib3V0LnMxIjoxfX19 endpoint with a Last-Event-ID of 5e167232dad37a1127611446, to which the server can send customized notifications:

=> Request
GET /s/eyJ2IjozLCJxIjp7ImZpbmQiOnt9LCJwcm9qZWN0Ijp7InR4LmgiOjEsInRpbWVzdGFtcCI6MSwib3V0LnMxIjoxfX19 HTTP/1.1
Host: txo.bitsocket.network
Accept: text/event-stream
Last-Event-ID: 5e167232dad37a1127611446
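
The long path segment in this request is the Bitquery itself: it decodes (standard base64) to the compact JSON {"v":3,"q":{"find":{},"project":{"tx.h":1,"timestamp":1,"out.s1":1}}}. Here is a minimal sketch of how a client might build the same path and headers. The helper names are hypothetical, and it assumes compact JSON plus standard base64 is all the endpoint expects.

```python
import base64
import json

HOST = "txo.bitsocket.network"  # host taken from the request above


def bitquery_to_path(query: dict) -> str:
    """Serialize a Bitquery as compact JSON and base64-encode it into an /s/ path."""
    compact = json.dumps(query, separators=(",", ":"))
    encoded = base64.b64encode(compact.encode("utf-8")).decode("ascii")
    return "/s/" + encoded


def subscription_headers(last_event_id=None) -> dict:
    """Headers for an SSE subscription; Last-Event-ID is only sent when known."""
    headers = {"Host": HOST, "Accept": "text/event-stream"}
    if last_event_id is not None:
        headers["Last-Event-ID"] = last_event_id
    return headers


query = {
    "v": 3,
    "q": {"find": {}, "project": {"tx.h": 1, "timestamp": 1, "out.s1": 1}},
}
path = bitquery_to_path(query)
print(path)
print(subscription_headers("5e167232dad37a1127611446"))
```

The key order and the compact separators matter only for reproducing the exact string shown in the request; any equivalent JSON encodes the same query.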

Every time the Bitsocket server sends a new notification event, it updates the Last-Event-ID on the client side. Using this property, a Bitsocket 2.0 client always keeps track of the last timestamp of the batch of events it has received. And this timestamp updates every time the client receives a new notification, as seen below:

The id field represents the Last-Event-ID header, which is always a unique incrementing value. Note that the first request doesn’t have an id and shows undefined. This is because it only gets the last event id timestamp once it starts getting events.
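
To make the client-side bookkeeping concrete, here is a simplified sketch of how an SSE client tracks the last event id while reading the stream. It is not Bitsocket's client code, just a stripped-down reading of the SSE wire format (id: and data: fields, with a blank line ending each event). The second id and both payloads in the sample are made up.

```python
def track_last_event_id(lines):
    """Yield (last_event_id, data) for each event in a stream of SSE lines."""
    last_event_id = None  # stays None until the server sends an "id:" field
    data_lines = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates one event.
            if data_lines:
                yield last_event_id, "\n".join(data_lines)
            data_lines = []
        elif line.startswith(":"):
            continue  # comment line, ignored
        else:
            field, _, value = line.partition(":")
            if value.startswith(" "):
                value = value[1:]  # one optional leading space is stripped
            if field == "id":
                last_event_id = value
            elif field == "data":
                data_lines.append(value)


# Made-up sample stream; only the first id comes from the request above.
sample = [
    "id: 5e167232dad37a1127611446\n",
    'data: [{"tx": {"h": "aa"}}]\n',
    "\n",
    "id: 5e167232dad37a1127611447\n",
    'data: [{"tx": {"h": "bb"}}]\n',
    "\n",
]
for event_id, data in track_last_event_id(sample):
    print(event_id, data)
```

A browser's EventSource does this tracking for you and sends the remembered id as the Last-Event-ID header when it reconnects.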

On the server side, Bitsocket looks at these ID values and ensures that the notification returns the entire batch of events which:

  1. match the query filter
  2. and have happened since the ID timestamp

Basically it takes the original query and creates a conjunction query by combining it with “greater than the last timestamp” condition, and then streams it in realtime.
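
Here is a rough sketch of what such a conjunction could look like as a MongoDB-style filter. The $and and $gt operators are standard MongoDB, but the "_id" field name, the "app" field, the helper names, and the sample events are all assumptions for illustration. The article does not show Bitsocket's actual server code.

```python
def resume_query(find: dict, last_event_id: str, id_field: str = "_id") -> dict:
    """Combine the subscriber's filter with a "newer than the checkpoint" condition."""
    return {"$and": [find, {id_field: {"$gt": last_event_id}}]}


def matches(event: dict, find: dict) -> bool:
    """Toy matcher: top-level equality only, just enough for this demo."""
    return all(event.get(key) == value for key, value in find.items())


# Made-up events; ids are fixed-length hex strings, so string order equals numeric order.
events = [
    {"_id": "5e167232dad37a1127611445", "app": "demo"},
    {"_id": "5e167232dad37a1127611447", "app": "demo"},
    {"_id": "5e167232dad37a1127611448", "app": "other"},
]
checkpoint = "5e167232dad37a1127611446"
find = {"app": "demo"}

combined = resume_query(find, checkpoint)
resumed = [e for e in events if matches(e, find) and e["_id"] > checkpoint]

print(combined)
print([e["_id"] for e in resumed])  # ['5e167232dad37a1127611447']
```

Only the event that both matches the filter and is newer than the checkpoint comes back, which is exactly the pair of conditions listed above.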

For example, if your client went offline for 10 minutes and just came back online, it would resume the connection with the last event id it remembers. Then Bitsocket would notice the checkpoint and start streaming all events whose ID timestamp is greater than the ID value sent by your client.

And just like this, Bitsocket 2.0 can guarantee a complete delivery of all relevant events for your applications, and you will never miss an event.

3. 100% Consistent with Planaria

The most important innovation with Bitsocket 2.0 is that Bitsocket is now powered by a persistent event database (Previously all events were ephemerally processed, filtered, and emitted without a persistent event database).

This means the Bitsocket and Planaria databases are exactly the same. There is no way an event can be triggered without existing in the corresponding Planaria db, and there is no way a transaction can exist in a Planaria db without a corresponding event being triggered, because now the event comes straight from the database.

With Bitsocket 2.0, the same Bitquery can be used to both:

  1. Query Planaria
  2. Subscribe to Bitsocket events

And they would return 100% identical results over time. They are 100% identical because they both use the Planaria database.

4. More Powerful Queries

Because Bitsocket 1.0 filters worked by emulating MongoDB queries, a lot of more complex queries either didn’t work, or didn’t work accurately enough to be reliable.

Since Bitsocket 2.0 directly makes use of the event database powered by a true MongoDB backend, now you can filter anything a typical Bitquery can filter.

One such operation is project. This was not supported in Bitsocket 1.0. But with Bitsocket 2.0, you can easily listen to a subset of each transaction object tree:

{
  "v": 3,
  "q": {
    "find": { "out.tape.cell.s": "19dbzMDDg4jZ4pvYzLb291nT8uCqDa61zH" },
    "project": { "out.tape.cell.s": 1 }
  }
}

Additionally, the r part of the query language has greatly improved. Instead of putting everything in memory, now everything is streaming based. The JQ engine that powers the r.f is now powered by StreamJQ. So the events are literally streamed live from MongoDB to JQ to the HTTP response directly to the client.

{
  "v": 3,
  "q": {
    "find": { "out.tape.cell.s": "19dbzMDDg4jZ4pvYzLb291nT8uCqDa61zH" },
    "project": { "out.tape.cell.s": 1 }
  },
  "r": {
    "f": "[.[] | .out[0].tape[1].cell[2].s | fromjson]"
  }
}
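
If you are not fluent in jq, the r.f expression above reads as: for each event in the batch, take out[0].tape[1].cell[2].s, parse that string as JSON, and collect the results into an array. Here is a rough Python equivalent of the expression, purely to illustrate its semantics. Unlike the streaming engine, it holds everything in memory, and the sample event is made up.

```python
import json


def apply_r_f(events: list) -> list:
    """Rough equivalent of the jq filter [.[] | .out[0].tape[1].cell[2].s | fromjson]."""
    return [json.loads(e["out"][0]["tape"][1]["cell"][2]["s"]) for e in events]


# Made-up event shaped just deeply enough for the path above to resolve.
sample_events = [
    {
        "out": [
            {
                "tape": [
                    {"cell": []},
                    {"cell": [{"s": "a"}, {"s": "b"}, {"s": '{"hello": "world"}'}]},
                ]
            }
        ]
    }
]

print(apply_r_f(sample_events))  # [{'hello': 'world'}]
```

So the filter only makes sense for transactions whose third cell on the second tape actually contains a JSON string.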

5. Efficient and Fast Delivery

In Bitsocket 1.0, every atomic event was a single item, since all events were individually processed and emitted as they came in from the Bitcoin peer network. Basically, every new transaction created a new event processing task which was pushed to the global job queue. This overloaded the job queue: whenever the transaction volume increased, the queue could never catch up with the speed of incoming transactions.

With Bitsocket 2.0, we have no such inefficiency because there is a buffer: the Planaria event database.

The events come from the Planaria event database which is constantly indexing the entire Bitcoin transaction event universe in realtime. Because of this buffer (event database), all events can be batch processed and emitted in chunks. For example, if there were 10,000 relevant event items within the last 1 second, instead of having to create 10,000 separate atomic tasks which process and emit 10,000 individual events, now Bitsocket only needs to run a single task which makes a single roundtrip to the database and triggers a single event made up of 10,000 items. It would be 10,000 times more efficient and faster.
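
As a rough sketch of the difference, here is what emitting one batched SSE message for 10,000 items could look like, instead of 10,000 separate messages. The id:/data: framing is the standard SSE wire format, but the function name, the item shape, and the choice of id are assumptions for illustration, not Bitsocket's actual implementation.

```python
import json


def format_sse_batch(items: list, last_id: str) -> str:
    """Serialize one batch of items as a single SSE message."""
    # json.dumps emits no newlines by default, so the payload fits on one data line.
    return f"id: {last_id}\ndata: {json.dumps(items)}\n\n"


# Made-up items standing in for 10,000 relevant transactions from the last second.
items = [{"tx": {"h": f"{i:064x}"}} for i in range(10_000)]
message = format_sse_batch(items, "5e167232dad37a1127611446")

payload = json.loads(message.split("data: ", 1)[1])
print(message.count("\n\n"))  # 1 message on the wire
print(len(payload))           # 10000 items inside it
```

The client receives a single event whose data is an array, and its Last-Event-ID checkpoint advances once per batch rather than once per transaction.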

In the following example, you will notice the events come in in chunks: Each atomic unit of events is an array of multiple transaction items:

6. Reliable Scalable Infrastructure

The most important focus of Bitsocket 2.0 is reliability. As mentioned above, until recently, Planaria has suffered from a lack of reliability guarantees, because I didn’t have enough time and resources to build new tools as well as maintain existing ones.

But today, we are ready to move forward with more reliable infrastructure. The launch of Bitsocket 2.0 beta is the first step to this goal.

Bitsocket 2.0 no longer runs from a single server. We have implemented a scalable architecture which runs as a cluster of replicated and load balanced containers which can dynamically meet the usage demand. Of course this is the first release so it is not yet perfect, but we are starting with the right architecture which is fundamentally more scalable, reliable, and maintainable than before.

Also we are dedicated to making sure Bitsocket delivers realtime events lightning fast not just to a certain local region, but globally around the world.

Please try out Bitsocket wherever you are located and let us know (on Twitter or Atlantis slack) if it’s not fast enough. We will make sure to guarantee speedy delivery around the world.

Why this matters

1. Trust

Bitsocket 2.0 is a big deal because we have redesigned it from scratch to provide reliability both in terms of architecture and infrastructure. Of course the new event database driven architecture is very important, but what’s more important is our dedication to the highly available, distributed load balanced infrastructure which will power everything.

Today’s Bitsocket launch is only the first step. The rest of the Planaria API family will follow soon.

Users can finally TRUST Bitsocket. There’s a big difference between not being able to fully trust a push notification service and being able to. Even if there’s only a 1% chance that a push notification event source may miss relevant events for you, you may want to keep polling so that you won’t miss any event, which defeats the purpose of push notifications in the first place.

Even though Bitsocket 1.0 had great potential, it has only been used in casual ways.

Bitsocket 2.0 was designed from scratch to power fault tolerant production apps.

2. A Building Block for Bitcoin Event Driven Architecture

Also, because Bitsocket is now designed to be fault tolerant, we can dogfood Bitsocket ourselves to power other infrastructure APIs going forward.

For example, we are working on a completely rewritten version of Bitbus which will be powered by Bitsocket 2.0. When this is completed, the new Bitbus will be significantly more efficient than the current Bitbus architecture. Bitbus will use up much less memory and not waste any network traffic and local processing power.

Conclusion

If you would like to start using Bitsocket today, please don’t forget to join the #bitsocket channel on Atlantis slack:

Also, you may want to learn how Bitquery works first:

Lastly, here’s the new website for Bitsocket. You can learn all about Bitsocket here:
