Yjs Fundamentals — Part 2: Sync & Awareness

Published in Dovetail Engineering · 6 min read · Dec 14, 2022

This is Part 2 of our Yjs Fundamentals series. See Part 1 of the series here: Yjs Fundamentals — Part 1: Theory

Yjs creates a real-time collaborative editing experience by synchronizing documents between clients and broadcasting users’ presence on a document (‘awareness’). In Part 2 of this series, I’ll explain how these two processes work. Sync and awareness each have their own protocol, and understanding the protocols is essential to implementing Yjs and debugging issues.

The document sync protocol

Recall from Part 1 of this series that documents are ephemeral, and edits are first-class citizens. If a client has a complete set of edits, it is always able to assemble them into the same resulting document, no matter what order it receives the edits.

Therefore, the sync protocol’s goal is to ensure that all clients eventually have all the edits relating to a document. It doesn’t need to guarantee that they are received in any particular order.

While there is a sync protocol for peer-to-peer client models, this blog post will focus on the sync protocol for the client-server model, which we use. Clients connect to the server over a WebSocket connection. Upon connection, the client and server send each other edits that the other doesn’t have (the ‘initial sync’). Whenever a client generates a new edit, it sends it to the server, which then broadcasts it to all clients connected to the same document.

This is another area where Yjs, and CRDTs generally, shine. The server can be ‘dumb’, and just store and broadcast edits. It does not need to process each incoming edit, and it rarely needs to perform compute-intensive operations, which makes it very scalable. This is a major advantage over the main alternative to CRDTs, Operational Transformation (OT), which requires the server to be smart and play an active role in processing each incoming edit.
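To make the ‘dumb server’ idea concrete, here is a minimal in-memory sketch in TypeScript. It is not the y-websocket server or our production code: `RelayServer`, `Edit`, and `Send` are hypothetical names made up for this example, edits are modelled as opaque strings, and skipping the sender when forwarding is a choice made in this sketch. The point is only that the server stores and forwards edits without ever looking inside them.

```typescript
// An edit is an opaque payload here; the server never inspects it.
type Edit = string;
// A connected client is modelled as a callback that delivers an edit to it.
type Send = (edit: Edit) => void;

class RelayServer {
  private log = new Map<string, Edit[]>(); // docId => stored edits
  private clients = new Map<string, Set<Send>>(); // docId => connected clients

  connect(docId: string, send: Send): void {
    if (!this.clients.has(docId)) this.clients.set(docId, new Set<Send>());
    this.clients.get(docId)!.add(send);
  }

  // Store the edit, then forward it unchanged to the other clients on the doc.
  receive(docId: string, from: Send, edit: Edit): void {
    const stored = this.log.get(docId) ?? [];
    stored.push(edit);
    this.log.set(docId, stored);
    this.clients.get(docId)?.forEach((send) => {
      if (send !== from) send(edit); // skipping the sender is a choice of this sketch
    });
  }

  stored(docId: string): Edit[] {
    return this.log.get(docId) ?? [];
  }
}

const relay = new RelayServer();
const inboxA: Edit[] = [];
const inboxB: Edit[] = [];
const sendA: Send = (edit) => { inboxA.push(edit); };
const sendB: Send = (edit) => { inboxB.push(edit); };

relay.connect("doc-1", sendA);
relay.connect("doc-1", sendB);
relay.receive("doc-1", sendA, "edit-from-A");
// inboxB now holds the edit, and the server has stored it for later joiners.
```

Nothing in `receive` depends on what the edit contains, which is why this kind of server needs so little compute per message.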

By contrast, our other real-time collaboration technology, ProseMirror, rejects out-of-order edits from collaborators. A collaborator whose edit is rejected must re-sync and recalculate the edit so that it can be applied to the latest version of the document.

There are three message types in the sync protocol, and each one can be sent by either the client or the server: SyncStep1, SyncStep2, and Update. The first two are used only during the initial sync. After that, Update messages carry any further edits that need to be shared.

An overview of the **initial sync** and **update** steps

Initial sync phase

The initial sync process runs after a WebSocket connection is established. It ensures that any edits created by the client, or added to the server, while the two were not connected are shared. It runs in two rounds; see the sequence diagram above.

SyncStep1: “Please send me edits that you have but I don’t. I have inserts up to this point in time.” The message data is a state vector, which efficiently encodes what range of inserts the sending party has.

How does a state vector work? Recall that each insert is identified by an ID of the form { clientId, clock }, where clock is an auto-incrementing integer maintained separately by each client. A state vector is a map of clientId ⇒ clock that records, for each clientId, the most recent insert.ID.clock the sending party has. This is much more efficient than sending the entire list of insert IDs that the party has.
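As a rough illustration of the state vector described above, here is a minimal TypeScript sketch. It is a simplified model rather than Yjs’s internal implementation: the `InsertId` and `StateVector` types and the `computeStateVector` helper are hypothetical names introduced for this example. It also assumes each client’s clocks arrive without gaps, so the highest clock per client is enough to summarise what a party holds.

```typescript
// Simplified model of an insert ID; real Yjs items carry more fields.
interface InsertId {
  clientId: number;
  clock: number;
}

// clientId => most recent clock seen for that client.
type StateVector = Map<number, number>;

function computeStateVector(ids: InsertId[]): StateVector {
  const sv: StateVector = new Map<number, number>();
  for (const id of ids) {
    const seen = sv.get(id.clientId);
    // Keep only the highest clock per client.
    if (seen === undefined || id.clock > seen) {
      sv.set(id.clientId, id.clock);
    }
  }
  return sv;
}

const knownInserts: InsertId[] = [
  { clientId: 1, clock: 0 },
  { clientId: 1, clock: 1 },
  { clientId: 1, clock: 2 },
  { clientId: 2, clock: 0 },
  { clientId: 2, clock: 1 },
];

const stateVector = computeStateVector(knownInserts);
// Five insert IDs collapse into two map entries: 1 => 2 and 2 => 1.
```

The size of the state vector grows with the number of clients that have ever edited the document, not with the number of inserts, which is where the saving comes from.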

SyncStep2: “Here are the edits that I have but you don’t.” The sending party filters its inserts for those with a clock newer than the one the received state vector lists for the same clientId, and responds with the full data of those inserts. It also sends all deletes, because there is no cheap way to communicate which deletes a party has in SyncStep1. When the server is the receiving party, it will therefore record duplicated deletes every time a new client connects. These are de-duplicated by squashing (explained later).
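The filtering step behind SyncStep2 can be sketched in a few lines of TypeScript. This is a simplified model under stated assumptions, not the encoding Yjs actually uses: `Insert`, `StateVector`, and `missingInserts` are hypothetical names for this example, insert content is a plain string, and deletes are left out entirely (in the real protocol they are all sent, as described above).

```typescript
interface Insert {
  clientId: number;
  clock: number;
  content: string;
}

// clientId => most recent clock the requesting party has for that client.
type StateVector = Map<number, number>;

// Hypothetical helper: pick the inserts the requesting party is missing.
function missingInserts(local: Insert[], remote: StateVector): Insert[] {
  return local.filter((insert) => {
    const remoteClock = remote.get(insert.clientId);
    // Unknown client: the requester has nothing from it, so send everything.
    return remoteClock === undefined || insert.clock > remoteClock;
  });
}

const serverInserts: Insert[] = [
  { clientId: 1, clock: 0, content: "a" },
  { clientId: 1, clock: 1, content: "b" },
  { clientId: 2, clock: 0, content: "c" },
  { clientId: 3, clock: 0, content: "d" },
];

// SyncStep1 from the client: "I have up to clock 0 from clients 1 and 2."
const clientStateVector: StateVector = new Map<number, number>([
  [1, 0],
  [2, 0],
]);

const reply = missingInserts(serverInserts, clientStateVector);
// reply holds "b" (newer clock from client 1) and "d" (client 3 is unknown).
```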

After the initial sync

After the initial sync, all further edits are sent through Update messages. These carry the same type of content as SyncStep2 messages, but they are used only after the initial sync: a client sends each new edit it generates to the server, and the server then forwards it to all clients connected to the same document.

Why is SyncStep1 → SyncStep2 repeated?

The first round of SyncStep1 → SyncStep2 shares edits from the server to the client, which the client doesn’t have. At the end of this round, the client has all the edits the server has, plus any edits only the client has. In the second round, the client sends edits to the server that the server doesn’t have. At this point, the client and server are in complete sync — they each have the same set of edits.
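The two rounds can be walked through with a small TypeScript simulation. Treat it as a sketch under assumptions rather than the real protocol code: the `Peer` class and its `syncStep1`, `syncStep2`, and `apply` methods are hypothetical names for this example, inserts are reduced to their IDs, and deletes are ignored.

```typescript
interface Insert {
  clientId: number;
  clock: number;
}

type StateVector = Map<number, number>;

class Peer {
  private inserts = new Map<string, Insert>();

  add(insert: Insert): void {
    this.inserts.set(`${insert.clientId}:${insert.clock}`, insert);
  }

  // SyncStep1 payload: a summary of what this peer already has.
  syncStep1(): StateVector {
    const sv: StateVector = new Map<number, number>();
    this.inserts.forEach((insert) => {
      sv.set(insert.clientId, Math.max(sv.get(insert.clientId) ?? -1, insert.clock));
    });
    return sv;
  }

  // SyncStep2 payload: the inserts the requesting peer is missing.
  syncStep2(remote: StateVector): Insert[] {
    const missing: Insert[] = [];
    this.inserts.forEach((insert) => {
      if (insert.clock > (remote.get(insert.clientId) ?? -1)) missing.push(insert);
    });
    return missing;
  }

  apply(inserts: Insert[]): void {
    inserts.forEach((insert) => this.add(insert));
  }

  ids(): string[] {
    return Array.from(this.inserts.keys()).sort();
  }
}

const serverPeer = new Peer();
const clientPeer = new Peer();
serverPeer.add({ clientId: 1, clock: 0 });
serverPeer.add({ clientId: 1, clock: 1 }); // arrived while the client was offline
clientPeer.add({ clientId: 1, clock: 0 }); // synced in an earlier session
clientPeer.add({ clientId: 2, clock: 0 }); // created by the client while offline

// Round 1: the client asks (SyncStep1) and the server answers (SyncStep2).
clientPeer.apply(serverPeer.syncStep2(clientPeer.syncStep1()));
// Round 2: the server asks and the client answers.
serverPeer.apply(clientPeer.syncStep2(serverPeer.syncStep1()));
// Both peers now hold the same three inserts.
```

Neither round alone is enough in this example: after round 1 the server is still missing the insert the client made offline, and round 2 is what delivers it.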

Here is an example client code snippet to show how Yjs abstracts away the sync protocol when writing client code, and how to listen for new edits.

import * as Y from "yjs";
import { WebsocketProvider } from "y-websocket";

const doc = new Y.Doc();

// The provider takes care of broadcasting and receiving sync and awareness.
// This will automatically perform the initial sync.
const provider = new WebsocketProvider(
  "wss://dovetailapp.com/api/yjs",
  "00000000-0000-0000-0000-000000000000", // docId
  doc
);

// Create a top level data type
const yarray = doc.getArray("myList");

// Create a new edit
yarray.insert(0, [1]);
// `provider` will automatically send the change to the server

// Listen to changes to this part of the document from any client
yarray.observe(changeEvent => {
  // See what has changed
  changeEvent.changes.added; // Set<Y.Item>
  changeEvent.changes.deleted; // Set<Y.Item>
});

Awareness

‘Awareness’ is a feature that lets clients share things like their current cursor position and presence on a document, without encoding that information in the document itself. By default, each client sends an awareness event to the server every 30s. Other clients mark a client as offline if they have not received an awareness message from it in the last 30s.

Awareness doesn’t need to use the same data structure as document edits, since we don’t need to persist old awareness states for each client. This means conflict resolution reduces to ‘last write wins’: each awareness event relating to a client simply overwrites that client’s previous event data. The awareness data structure for a document is a map of clientId ⇒ JSON. The JSON can contain any info you like, such as cursor position, selection set, and application data like userId.
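To show how little machinery ‘last write wins’ needs, here is a small TypeScript model of the awareness map. It is a sketch, not the awareness implementation that ships with Yjs: `AwarenessModel`, `receive`, and `prune` are hypothetical names, timestamps are passed in explicitly to keep the example deterministic, and the 30s constant simply mirrors the timeout mentioned above.

```typescript
type AwarenessState = Record<string, unknown>;

interface Entry {
  state: AwarenessState;
  lastUpdated: number; // milliseconds
}

const OFFLINE_AFTER_MS = 30000; // the 30s timeout described above

class AwarenessModel {
  private entries = new Map<number, Entry>(); // clientId => latest state

  // Last write wins: the new state replaces the old one wholesale.
  receive(clientId: number, state: AwarenessState, now: number): void {
    this.entries.set(clientId, { state, lastUpdated: now });
  }

  // Drop clients that have not been heard from within the timeout.
  prune(now: number): number[] {
    const removed: number[] = [];
    this.entries.forEach((entry, clientId) => {
      if (now - entry.lastUpdated >= OFFLINE_AFTER_MS) removed.push(clientId);
    });
    removed.forEach((clientId) => this.entries.delete(clientId));
    return removed;
  }

  get(clientId: number): AwarenessState | undefined {
    return this.entries.get(clientId)?.state;
  }
}

const awareness = new AwarenessModel();
awareness.receive(1, { userId: "u1", cursor: { x: 1, y: 2 } }, 0);
awareness.receive(1, { cursor: { x: 5, y: 6 } }, 10000); // replaces, does not merge
awareness.receive(2, { userId: "u2" }, 25000);

const afterOverwrite = awareness.get(1); // no userId: the old state is gone
const wentOffline = awareness.prune(40000); // client 1 was last heard from 30s ago
```

Because the second message from client 1 replaces the first rather than merging with it, a client that wants to keep a field such as `userId` has to resend it in every state it sets.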

How awareness messages are transmitted between clients and the server

Here is some example client code which shows how you can listen for awareness changes on a document, and set the current client’s awareness state.

import * as Y from "yjs";
import { WebsocketProvider } from "y-websocket";

const doc = new Y.Doc();

// The provider takes care of broadcasting and receiving sync and awareness
const provider = new WebsocketProvider(
  "wss://dovetailapp.com/api/yjs",
  "00000000-0000-0000-0000-000000000000", // docId
  doc
);

// Listens for awareness changes from other clients
provider.awareness.on('change', changes => {
  // Client IDs whose awareness state was added, updated, or removed:
  // { added: number[], updated: number[], removed: number[] }
  changes;

  // Complete map of awareness states, in the form of `clientId => JSON`
  provider.awareness.getStates();
});

// Sets the awareness state of the current client, which provider will send to the server
provider.awareness.setLocalState({
  userId: "00000000-0000-0000-0000-000000000000",
  cursorPosition: { x: 100, y: 200 },
  selectedNoteIds: [
    "00000000-0000-0000-0000-000000000000",
    "11111111-1111-1111-1111-111111111111",
    "22222222-2222-2222-2222-222222222222",
  ],
});

What’s next?

Stay tuned for Part 3, which covers our server-side Yjs architecture and the optimizations we’ve made.