Sharing a socket connection between tabs
Every modern service should have stories and a chat. We decided to start by building a messenger for hh.ru. My name is Vladislove Korotun, and I am a frontend developer. In this article, I will explain how an unconventional approach to using Web Workers helped us complete this task.
If you prefer watching and listening, there is also a video version on YouTube.
Preparation
To make the chat as responsive as possible without spamming the server with empty polling requests on a timer, we decided to build it on WebSocket. We also really wanted a live counter of new messages on every page.
Here we faced our first problem. Our users often open a huge number of tabs, and keeping an active socket connection in each of them is very costly. In addition, simply updating the counter over HTTP on a timer in each tab put a heavy load on the server, even when the chat was available to only 20% of our audience.
We went looking for new solutions and studied how others deal with this. Some services open the chat in a separate tab and set up a socket connection only there. But we wanted server messages to be available in all our tabs, so this approach didn’t work for us.
We also found an interesting article on using SharedWorker to share one socket connection between all the tabs that have access to the worker. Unfortunately, Safari has now dropped support for SharedWorker, and Safari users are a large percentage of our audience.
In our search for a suitable solution, we turned our attention to ServiceWorker. These workers are available in all tabs and frames of their origin and can work with WebSocket. Our goal was to build a system that would let us use a single socket connection in all of our web applications, including those with a different origin than the socket server itself.
The scheme is as follows: the socket server is located on the websocket.hh.ru subdomain. On the same subdomain we host a proxy page that installs and activates the worker. This page can be embedded into any of our sites in an iframe, and it proxies socket events to the parent window via postMessage.
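The relay step on the proxy page can be sketched roughly as follows. This is a minimal illustration, not our production code: the function name relayWorkerEvents, the { type: "socket-event", payload } message shape, and the parent origin are all assumptions made for the example. The two interfaces are small stand-ins for navigator.serviceWorker and window.parent.

```typescript
// Stand-in for navigator.serviceWorker: it emits "message" events
// whenever the worker calls client.postMessage(...).
interface WorkerMessageSource {
  addEventListener(
    type: "message",
    listener: (event: { data: unknown }) => void,
  ): void;
}

// Stand-in for window.parent.
interface ParentWindowLike {
  postMessage(message: unknown, targetOrigin: string): void;
}

// Forwards every event received from the worker to the embedding page.
// The envelope shape is hypothetical.
function relayWorkerEvents(
  workerMessages: WorkerMessageSource,
  parent: ParentWindowLike,
  parentOrigin: string,
): void {
  workerMessages.addEventListener("message", (event) => {
    // Passing an explicit target origin prevents delivery to an
    // unexpected embedder.
    parent.postMessage(
      { type: "socket-event", payload: event.data },
      parentOrigin,
    );
  });
}

// On the real proxy page this would be called roughly as:
//   relayWorkerEvents(navigator.serviceWorker, window.parent, "https://hh.ru");
```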
Thus, each of our applications that needs sockets can simply embed this iframe and use a special wrapper to access the message channel from the server. With this approach, there is only one socket connection, and it lives in the ServiceWorker run by the proxy page.
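On the application side, such a wrapper might look like the sketch below. It is only an illustration under assumptions: the name createSocketChannel, the { type: "socket-event", payload } message shape, and the https scheme of the proxy origin are invented for the example.

```typescript
// Hypothetical envelope posted by the proxy page.
interface ProxyMessage {
  type: "socket-event";
  payload: unknown;
}

// The two fields of a browser MessageEvent that the wrapper reads.
interface IncomingMessage {
  origin: string;
  data: unknown;
}

type Listener = (payload: unknown) => void;

function isProxyMessage(data: unknown): data is ProxyMessage {
  return (
    typeof data === "object" &&
    data !== null &&
    (data as { type?: unknown }).type === "socket-event"
  );
}

// Returns a handler for window "message" events and a subscribe function.
function createSocketChannel(proxyOrigin: string) {
  let listeners: Listener[] = [];

  function handleMessage(event: IncomingMessage): void {
    // Accept only messages sent by the proxy page itself.
    if (event.origin !== proxyOrigin) return;
    if (!isProxyMessage(event.data)) return;
    const payload = event.data.payload;
    listeners.forEach((listener) => listener(payload));
  }

  // Registers a listener and returns a function that removes it.
  function subscribe(listener: Listener): () => void {
    listeners.push(listener);
    return () => {
      listeners = listeners.filter((l) => l !== listener);
    };
  }

  return { handleMessage, subscribe };
}

// In the browser, after embedding the iframe:
//   const channel = createSocketChannel("https://websocket.hh.ru");
//   window.addEventListener("message", channel.handleMessage);
```

The origin check matters because any frame or window can post messages to the application, and the wrapper should trust only the proxy page.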
First difficulties
During development, everything worked like a charm. But when we submitted the task to testing, we found that 10–15 minutes after the page was launched, the counter stopped responding to updates.
It turned out that the connection to the server was lost and was not restored until the page was reloaded. Here we had to dig a little deeper and read the documentation more attentively.
It turns out that the main difference between ServiceWorkers and SharedWorkers is that the former do not run permanently. They are launched to perform specific tasks, and after some idle time the browser silently shuts them down until the next request arrives.
In our scheme, the socket connection is used in one direction only: we catch server events, but we fetch data over HTTP. So the browser decided that the worker had finished its job and could be shut down. We tried to get around this limitation by sending empty messages to the worker on a timer. The experiment was a success.
As long as the worker was receiving requests from its clients, it continued to live and maintain an active socket connection. Even if the browser at some point decided that the worker was running too long and shut it down, the next ping message woke it up, and the socket connection was restored.
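The worker side of this keep-alive can be sketched as follows. This is a simplified illustration: the name createPingHandler, the "ping" message, and the interval in the comments are assumptions, and SocketLike is a stand-in for the part of WebSocket that the logic needs.

```typescript
// The only WebSocket field this logic reads.
// readyState values: 0 CONNECTING, 1 OPEN, 2 CLOSING, 3 CLOSED.
interface SocketLike {
  readyState: number;
}

const SOCKET_OPEN = 1;

// Returns a handler to call on every ping from a client.
// It opens the socket if there is none yet (for example, right after the
// browser restarted the worker) or if the previous one is closing or closed.
function createPingHandler(openSocket: () => SocketLike) {
  let socket: SocketLike | null = null;

  return function onPing(): SocketLike {
    if (socket === null || socket.readyState > SOCKET_OPEN) {
      socket = openSocket();
    }
    return socket;
  };
}

// Inside the worker (url is a placeholder):
//   const onPing = createPingHandler(() => new WebSocket(url));
//   self.addEventListener("message", (e) => {
//     if (e.data === "ping") onPing();
//   });
//
// On the proxy page (the 20-second interval is a guess, not a measured value):
//   setInterval(() => {
//     navigator.serviceWorker.controller?.postMessage("ping");
//   }, 20000);
```

Because a restarted worker loses its in-memory state, the socket variable starts as null again, and the first ping after a restart re-establishes the connection.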
At this point, a second problem awaited us. During the same round of testing, we found that in a tab opened in a fresh incognito session, where the worker is registered for the first time, the connection is established but events do not reach the tab. After refreshing the page, however, everything works as expected.
The cause again lies in how ServiceWorker is designed. Its main purpose is to act as a caching layer and intercept HTTP requests. To keep page behavior consistent, a tab that was loaded before the worker was registered does not become its client, because that page has already started working without the worker. After a refresh, the page loads under the already registered worker, and the connection works.
To make the worker take control of the current tab immediately, we resorted to the built-in method self.clients.claim(), which we also found in the documentation. It makes all tabs in the worker’s scope its clients right away.
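A minimal sketch of how this call is usually wired into the worker’s activate event is shown below. The helper name claimClientsOnActivate and the two interfaces are assumptions made so the example is self-contained; in a real worker file the scope is simply self.

```typescript
// Stand-in for the ExtendableEvent passed to "activate" listeners.
interface ExtendableEventLike {
  waitUntil(promise: Promise<unknown>): void;
}

// Stand-in for the part of ServiceWorkerGlobalScope used here.
interface WorkerScopeLike {
  clients: { claim(): Promise<void> };
  addEventListener(
    type: "activate",
    listener: (event: ExtendableEventLike) => void,
  ): void;
}

// Makes the worker take control of already open pages in its scope
// as soon as it becomes active, without waiting for a reload.
function claimClientsOnActivate(scope: WorkerScopeLike): void {
  scope.addEventListener("activate", (event) => {
    // waitUntil keeps the worker alive until claiming has finished.
    event.waitUntil(scope.clients.claim());
  });
}

// In the worker file:
//   claimClientsOnActivate(self as unknown as WorkerScopeLike);
```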
If the browser does not support workers, our proxy page switches to fallback mode and establishes a direct socket connection without the worker. Currently, 3% of our active connections work in fallback mode.
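The choice between the two modes can be reduced to a simple feature check, sketched here. The names chooseConnectionMode and ConnectionMode are hypothetical, and NavigatorLike stands in for the browser’s navigator object.

```typescript
// Stand-in for navigator: serviceWorker is absent in browsers
// (or contexts) without ServiceWorker support.
interface NavigatorLike {
  serviceWorker?: unknown;
}

type ConnectionMode = "worker" | "direct";

// "worker": register the ServiceWorker and share its socket.
// "direct": open a WebSocket from the proxy page itself (fallback mode).
function chooseConnectionMode(nav: NavigatorLike): ConnectionMode {
  return nav.serviceWorker ? "worker" : "direct";
}

// On the proxy page:
//   const mode = chooseConnectionMode(navigator);
```

A fuller version would also fall back to the direct connection when worker registration fails, not only when the API is missing.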
Optimizing counter requests
One socket connection is fine, but the counter itself still has to be requested over HTTP, because information in our socket implementation flows in one direction only. That means that at some point every tab will try to update the counter with its own HTTP request. Here again, ServiceWorker comes to the rescue.
We developed a separate worker, this time tied to the chat domain, and a proxy page to access it. The counter proxy page embeds the connection proxy page and listens for its updates. When the counter needs to be updated, a request goes to the worker, where it is debounced, so the new counter value is requested once for all tabs within a given time window. The received value is sent to all clients of this worker, that is, to the counter proxy pages, and they pass it to the parent application, which can then display the new number.
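The debounce inside the counter worker can be sketched like this. It is an illustration under assumptions: the names createCounterRefresher, fetchCounter, and broadcast, as well as the injected Timers object, are invented for the example, and the delay value is up to the caller.

```typescript
// Timer functions are injected so the logic does not depend on globals;
// in a worker they would be setTimeout and clearTimeout.
interface Timers {
  set(callback: () => void, delayMs: number): unknown;
  clear(id: unknown): void;
}

// Returns a function that tabs call (via messages) when the counter may
// have changed. However many calls arrive within delayMs, only one HTTP
// request is made, and its result is broadcast to every client.
function createCounterRefresher(
  fetchCounter: () => Promise<number>,
  broadcast: (count: number) => void,
  delayMs: number,
  timers: Timers,
) {
  let pending: { id: unknown } | null = null;

  return function requestRefresh(): void {
    // A newer request replaces the scheduled one instead of adding to it.
    if (pending !== null) timers.clear(pending.id);
    const id = timers.set(() => {
      pending = null;
      fetchCounter()
        .then(broadcast)
        .catch(() => {
          // On failure the clients simply keep the previous value.
        });
    }, delayMs);
    pending = { id };
  };
}

// Inside the worker, broadcast could be implemented as:
//   (count) => self.clients.matchAll().then((clients) =>
//     clients.forEach((client) => client.postMessage(count)));
```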
As a result, the load from unread message counter requests decreased by a factor of 40. But this solution has its disadvantages as well. Because we are cheating the browser in a way, one day that loophole might be closed, and we will have to look for a new solution.
Summary
At the moment, this system has been running in production for almost a year without causing problems for either users or developers. Each of our applications, on any domain, can access a single socket connection and receive events from the server in real time.
I’m sure we’ll find even more uses for this technology in the future, and we’ll delight you with new features that work at blazing speed.
That’s all I have to say. Let us know how you have used workers and what approaches you have taken to solve similar problems. We would be glad to learn something new from you.