Phoenix Presence for social networks

Phoenix Presence is awesome! It makes tracking present users (or any other connections or services) easy and reliable. If you haven’t checked it out yet, you should.

But how do you track the connection state of multiple users in a social network, where scalability and user privacy are required, and where each user’s social network is different from the others’?

I tried to find out. This article describes a simple solution, demonstrated using the standard Phoenix PubSub and Presence framework.

Social networks are not (just) chat rooms

The official examples, and most online tutorials, use the concept of chat rooms when demonstrating the features of Phoenix Presence. Setting up a room-based chat system that tracks connected users in each room is incredibly easy with Phoenix.

However, the room-based approach is not a good fit for most social networks. Users are not connected to a few groups/rooms where each user is visible to everyone else. Instead, each user should only be notified of the connection status of their own social circle, which is not shared with any other user.

At first, I thought of a few naive solutions using the room-based approach.

A single presence “room”

You could connect all users to the same room, and sync their presence state to the clients while only displaying their friends. This would however be a bad solution for a few reasons:

  • It would not scale well. Imagine having tens of thousands of connected users. Making all those clients sync the full state would require quite a lot of network and computing resources.
  • It’s important that each user receives data for each of their friends, but not for any user who is not their friend. Letting all users subscribe to all other users’ presence state would be quite a privacy issue.

Intercept and filter

You could create a single room that all users are connected to, while using intercept to filter out the presence_diff events that each user should not receive. You would also have to filter the initial presence state before pushing it to the client. While this would preserve user privacy and use minimal network resources, it would not scale well server-side:

  • You have to intercept each presence diff, which is slower than using the default fastlane.
  • The PubSub layer would have to send a message to every connected user’s process, even though very few of them would actually use the message. Since every diff would be delivered to every connected process, the total message volume would scale quadratically with the number of users.

A room for each friendship

You could also create a separate friendship room for each pair of friends, and join each user to the N rooms required to match their social network. This would avoid the privacy and scaling issues of transmitting all presence data to all users. But it’s also a no-go for a few reasons:

  • It would require a more complicated setup on the client side, with N Channel instances that need to be initialised and handled.
  • Since each Phoenix Channel is a process on the server side, you would end up with N processes per user.
  • For each present user, you would have to track their presence across N topics, which would have to be synced to and stored on all connected nodes, putting N times more load on your servers than necessary.

Presence tracking without chat rooms

To track your social circle in a scalable and privacy-aware manner, we have to separate the presence state from the Phoenix Channels used to transmit it. To do this, we’ve got to understand the basics of how Phoenix Presence works and how it relates to the PubSub layer.

Understanding Phoenix.Tracker

At the core of Phoenix Presence, there is Phoenix.Tracker. It is what actually tracks and synchronises presence data across the cluster. Any given process may be tracked with a given topic and key (and any arbitrary metadata). The topic may be any string, and neither the topic nor the process has to have any connection to Phoenix Channels.

Phoenix.Tracker is a behaviour, and implementing modules use the handle_diff callback to be notified of state changes.
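To make that concrete, here is a minimal sketch of a bare Tracker implementation. The module name and the logging in handle_diff are my own; a real app would need the phoenix_pubsub dependency and would do something more useful with the diff than printing it:

```elixir
defmodule MyApp.Tracker do
  @behaviour Phoenix.Tracker

  def start_link(opts) do
    opts = Keyword.merge([name: __MODULE__], opts)
    Phoenix.Tracker.start_link(__MODULE__, opts, opts)
  end

  @impl true
  def init(opts) do
    {:ok, %{pubsub_server: Keyword.fetch!(opts, :pubsub_server)}}
  end

  # Called with %{topic => {joins, leaves}} whenever tracked state changes,
  # locally or on another node in the cluster.
  @impl true
  def handle_diff(diff, state) do
    for {topic, {joins, leaves}} <- diff do
      for {key, meta} <- joins, do: IO.puts("join on #{topic}: #{key} #{inspect(meta)}")
      for {key, _meta} <- leaves, do: IO.puts("leave on #{topic}: #{key}")
    end

    {:ok, state}
  end
end
```

Any process can then be tracked with Phoenix.Tracker.track/5, using any pid, topic and key.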

Phoenix.Presence behaviour

The Phoenix.Presence behaviour, which is what most Phoenix apps will use, is mostly a wrapper around Phoenix.Tracker. It provides a cleaner API for use with Phoenix Channels, as well as some process and supervisor handling for the tracker and its Tasks.

It also implements the Phoenix.Tracker handle_diff callback. Its implementation broadcasts the diff, as a presence_diff event, over Phoenix.PubSub on the same topic as the presence topic. This is important, since it means that we can use the presence topic to control where diff events are sent.
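As a hedged approximation (DiffBroadcast is a made-up helper, not Phoenix’s actual implementation), the payload shape of such a presence_diff event can be sketched like this:

```elixir
defmodule DiffBroadcast do
  # Convert one topic's {joins, leaves} diff into the %{joins: ..., leaves: ...}
  # payload shape that Phoenix.Presence broadcasts as a "presence_diff" event.
  def payload({joins, leaves}) do
    %{joins: group(joins), leaves: group(leaves)}
  end

  # Group a list of {key, meta} entries by key, collecting the metas per key.
  defp group(entries) do
    Enum.reduce(entries, %{}, fn {key, meta}, acc ->
      Map.update(acc, key, %{metas: [meta]}, fn %{metas: metas} ->
        %{metas: metas ++ [meta]}
      end)
    end)
  end
end
```

For example, DiffBroadcast.payload({[{"42", %{online_at: "100"}}], []}) yields %{joins: %{"42" => %{metas: [%{online_at: "100"}]}}, leaves: %{}} — the same shape the JS client receives.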

At first, I assumed that Presence.track was involved in adding some subscription to the presence events. But that is not the case: simply joining a Channel with a given topic means that you will receive presence_diff events. The “track” in the function name does not imply that you wish to track something/someone else, but that you wish for the presence framework to track you.

The Phoenix.Presence track, untrack and update functions all take a Phoenix.Socket struct as their first argument. They do, however, all have an alternate form where the socket struct is replaced by a pid and a topic. When using the socket as the first argument, the pid and topic are simply the same as those of the channel socket. You may call them with any pid and topic you like.

Tying it all together: A working example

I’ve created a phoenix_social_presence demo on GitHub. It’s a basic Phoenix project, which simply gives each user an ID and uses that to connect to a private Phoenix Channel. Think of this channel as a topic that is only to be used by the corresponding user. It would be perfect for broadcasting notifications to the user, so each user should always be connected to their user channel.

Before social presence

In this revision, the channel module is set up like most other Phoenix Presence examples.

It has a join function, which verifies the user and sends an :after_join message to itself:

  def join("user:" <> user_id_str, _payload, socket) do
    if to_string(socket.assigns.user_id) == user_id_str do
      send(self(), :after_join)
      {:ok, socket}
    else
      {:error, %{reason: "unauthorized"}}
    end
  end

The :after_join handler is just like the one the default generated Presence setup suggests. It fetches the current presence state (of the same topic as the socket) and pushes it to the client. It also tracks the current channel socket under its user ID.

  def handle_info(:after_join, socket) do
    push(socket, "presence_state", MyPresence.list(socket))

    {:ok, _} = MyPresence.track(socket, socket.assigns.user_id, %{
      online_at: inspect(System.system_time(:second))
    })

    {:noreply, socket}
  end

Using this revision works, but since only a single user ID is allowed into the topic, it will not distribute any presence state to the user’s friends.

Syncing presence state asymmetrically

In order to sync user data for a given set of friends, I wrote this fairly simple commit.

Instead of calling MyPresence.track(socket, ...), we’ll track the user on a separate topic:

  defp track_user_presence(user_id) do
    {:ok, _} = MyPresence.track(self(), presence_topic(user_id), user_id, %{
      online_at: inspect(System.system_time(:second))
    })
  end

  defp presence_topic(user_id) do
    "user_presence:#{user_id}"
  end

Note that we’re using track/4 instead of track/3, since we want to control the topic. We’re using self() for the pid (since this is called from the channel process) and our presence_topic function to reliably map a user to a presence topic. With this setup, each presence topic will only contain a single key, and each user will only be tracked on a single topic.

Next up, we have to retrieve the current presence state of all of the user’s friends. Since they are now distributed across multiple topics, we have to request the presence state of each presence topic:

    user_ids
    |> Enum.map(&presence_topic/1)
    |> Enum.uniq()
    |> Enum.map(&MyPresence.list/1)
    |> Enum.reduce(%{}, fn map, acc -> Map.merge(acc, map) end)

user_ids is a list of the user IDs that the current user should get presence updates for.
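As a self-contained illustration of the merge step (the sample maps below are made up, but mimic the shape returned by MyPresence.list/1):

```elixir
# Per-topic presence states; in this design each topic holds at most one key.
states = [
  %{"1" => %{metas: [%{online_at: "100"}]}},
  %{},                                       # this friend is not present
  %{"3" => %{metas: [%{online_at: "200"}]}}
]

# Merge them into one map, exactly as the reduce step in the pipeline does.
merged = Enum.reduce(states, %{}, fn map, acc -> Map.merge(acc, map) end)
# merged combines all topics: %{"1" => %{metas: [...]}, "3" => %{metas: [...]}}
```

Since each topic contains at most one key, Map.merge never has to resolve conflicting keys here.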

Users are not connected to a Phoenix Channel for each presence topic that they should get updates for, so we have to make sure updates are pushed to their sockets as well. We’ll do this by subscribing the current process to each relevant presence topic on the PubSub layer. I’ve wrapped this in the same pipeline as the initial presence state retrieval above:

  defp get_and_subscribe_presence_multi(socket, user_ids) do
    user_ids
    |> Enum.map(&presence_topic/1)
    |> Enum.uniq()
    |> Enum.map(fn topic ->
      :ok = Phoenix.PubSub.subscribe(
        socket.pubsub_server,
        topic,
        fastlane: {socket.transport_pid, socket.serializer, []}
      )

      MyPresence.list(topic)
    end)
    |> Enum.reduce(%{}, fn map, acc -> Map.merge(acc, map) end)
  end

The subscribe call above will subscribe the current process to events broadcast on the given topic. The fastlane option tells Phoenix to send the events directly to the transport process (and to only serialise them once, even when there are multiple subscribers). The last element lists any intercepts (events that should not be fastlaned), which I set to the empty list for simplicity.

With the modifications I’ve shown so far, the initial state will be displayed by the client. Any updates will be pushed to the client as well, but since the client doesn’t have a channel for the given topic, they would simply be ignored. Let’s make sure they’re handled through this modification in socket.js:

socket.onMessage(({topic, event, payload}) => {
  if (event == "presence_diff" && /^user_presence:\d+$/.test(topic)) {
    handlePresenceDiff(payload)
  }
})

The default Phoenix JS client has the socket.onMessage function. It will call the provided function with any message received from the server, on any topic. We pick out the messages matching the presence_diff event and the expected topic pattern, and handle them just like presence_diff events on the user’s own channel.

Performance

While the example code would need some tuning to be production-ready (related to general performance and possible race conditions), I think this kind of setup would scale well.

  • Each user’s presence is tracked and stored only once. Since the full presence state has to be synced and stored on each node, any duplication would be a possible scalability issue.
  • It uses the default PubSub mechanisms for transmitting messages. On each actual presence change, the diff will be handled and serialised once and distributed directly to each listening user’s socket.

Summary

Distributing presence state to a social network is both possible and surprisingly easy with Phoenix, once you understand how to wire the components together.

Check out the demo project.


Any questions, comments or suggestions? Please leave a response, reach out on Twitter or comment on the repo.


Looking for work? Come join me at Kundo, a Stockholm based SaaS-company building modern tools for digital customer service. We are looking for both junior and experienced developers for Python, Elixir and JS work. Swedish language skills required.