Playing with Change Streams

What are Change Streams?

Change streams allow applications to access real-time data changes without the complexity and risk of tailing the oplog.

If you are not familiar with them, please read the introductory blog post here: An introduction to change stream.

Impractical idea: making MongoDB relational, but let’s still explore

Below are two documents, one from the users collection and one from the posts collection. Notice that we are duplicating user data in the posts collection.

If we update the username in the users collection, we need to propagate that change to the posts collection.

To keep things simple, we will refer to the users collection as the source collection and the posts collection as the destination collection.

Likewise, _id is the source field and author._id is the destination field.

Whenever we update a source collection, we also have to propagate those changes to the destination collection(s).

Example schema

For this exploration let’s consider the following collections.





In the documents above, data is spread across several collections, and some fields (such as the author's username) are duplicated between them.

Now let’s consider the following two scenarios:

1. Modify a username

e.g. `oreo` to `ordeal`

We would probably perform the following update queries.
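A minimal sketch of what such queries might look like, in Python. It assumes a pymongo-style database handle `db`, collections named users, posts and comments, and that posts and comments embed the author as `author._id` and `author.username`. The function name `rename_user_everywhere` is hypothetical.

```python
def rename_user_everywhere(db, user_id, new_username):
    """Update the username at the source, then in every duplicated copy.

    `db` is assumed to behave like a pymongo Database: db["name"] returns
    a collection exposing update_one() and update_many().
    """
    # 1. Source collection: the user document itself.
    db["users"].update_one(
        {"_id": user_id},
        {"$set": {"username": new_username}},
    )

    # 2. Destination collections: every embedded copy of that user.
    #    Assumed layout: {"author": {"_id": ..., "username": ...}}
    for name in ("posts", "comments"):
        db[name].update_many(
            {"author._id": user_id},
            {"$set": {"author.username": new_username}},
        )
```

With a real deployment this could be called as `rename_user_everywhere(db, user_id, "ordeal")`. Note that the business logic has to know about every collection that holds a copy.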

2. Modify a tag

e.g. 'time-travel' to 'Time Travel'

We would probably perform the following update queries.
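A minimal sketch of such queries, in Python, under several assumptions: the tag lives in a source collection called tags (the name is a guess), each post embeds one tag as `tag._id` and `tag.title`, and `db` is a pymongo-style database handle. The function name is hypothetical.

```python
def rename_tag_everywhere(db, tag_id, new_title):
    """Rename a tag at the source, then in every post that embeds it."""
    # 1. Source collection (assumed to be called "tags").
    db["tags"].update_one(
        {"_id": tag_id},
        {"$set": {"title": new_title}},
    )

    # 2. Destination collection: posts that carry a copy of the tag.
    #    Assumed layout: {"tag": {"_id": ..., "title": ...}}
    #    If posts stored several tags in an array instead, the positional
    #    path "tag.$.title" would be needed in place of "tag.title".
    db["posts"].update_many(
        {"tag._id": tag_id},
        {"$set": {"tag.title": new_title}},
    )
```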

In both scenarios, it takes multiple queries to propagate the changes.

Using change streams, we could move the queries that run on destination collections out of our business logic.

A change stream can be opened against any collection and receives all the changes happening to that collection in real time.

A change stream notifies on the following five kinds of events: insert, update, delete, replace, and invalidate.

So, if we open a change stream against the users collection, we receive all the updates as they occur. We can filter those updates for specific fields, such as username, and whenever a change matches our criteria we can propagate it to the respective collections.
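The filtering can be pushed to the server by passing an aggregation pipeline when opening the stream. Below is a small sketch in Python; the helper names `field_update_pipeline` and `changed_value` are made up for illustration, while `operationType` and `updateDescription.updatedFields` are fields of the update event document.

```python
def field_update_pipeline(field):
    """Pipeline for watch(): keep only update events that set `field`."""
    return [
        {
            "$match": {
                "operationType": "update",
                f"updateDescription.updatedFields.{field}": {"$exists": True},
            }
        }
    ]


def changed_value(event, field):
    """Return the new value of `field` from an update event, or None."""
    updated = event.get("updateDescription", {}).get("updatedFields", {})
    return updated.get(field)
```

With pymongo, the pipeline would be used as `db["users"].watch(field_update_pipeline("username"))`. This filter only sees update events; a full-document replace arrives as a replace event and would need its own handling.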

  • Open a change stream on the source collection.
  • Listen for changes to a specific field on the source collection.
  • Propagate the changes to the destination collection(s).
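The three steps above can be sketched in Python for the username case. This is an illustration under assumptions, not a definitive implementation: `db` is a pymongo-style database handle, posts and comments embed the author as `author._id` / `author.username`, and both function names are hypothetical.

```python
def propagate_username(db, event):
    """Step 3: copy a changed username into posts and comments."""
    updated = event["updateDescription"]["updatedFields"]
    if "username" not in updated:
        return False  # some other field changed; nothing to propagate

    user_id = event["documentKey"]["_id"]
    for name in ("posts", "comments"):
        db[name].update_many(
            {"author._id": user_id},
            {"$set": {"author.username": updated["username"]}},
        )
    return True


def watch_users(db):
    """Steps 1 and 2: open the stream and handle events until it closes."""
    pipeline = [{"$match": {"operationType": "update"}}]
    with db["users"].watch(pipeline) as stream:
        for event in stream:
            propagate_username(db, event)
```

`watch_users` blocks while the stream is open, so it would typically run in its own worker process. Change streams are only available on replica sets and sharded clusters, not on a standalone server.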

For our 1st scenario (Modify a username)

Destination collections would be posts and comments

Source field would be username

Destination field would be author.username

For our 2nd scenario (Modify a tag)

Destination collection would be posts

Source field would be title

Destination field would be tag.title
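Both scenarios have the same shape, so they can be written down as data. The sketch below is an illustration only, not any particular library's API. The rule keys, the `plan_updates` helper, the tags source collection, and the `author._id` / `tag._id` match fields are all assumptions. It turns an update event into a list of (collection, filter, update) triples without touching the database.

```python
RULES = [
    {   # Scenario 1: modify a username
        "source": "users",
        "source_field": "username",
        "destinations": ["posts", "comments"],
        "match_field": "author._id",        # assumed link back to the user
        "destination_field": "author.username",
    },
    {   # Scenario 2: modify a tag
        "source": "tags",                   # assumed source collection name
        "source_field": "title",
        "destinations": ["posts"],
        "match_field": "tag._id",           # assumed link back to the tag
        "destination_field": "tag.title",
    },
]


def plan_updates(rules, source, event):
    """List the update_many() calls needed for one update event on `source`."""
    updated = event.get("updateDescription", {}).get("updatedFields", {})
    doc_id = event["documentKey"]["_id"]
    plan = []
    for rule in rules:
        if rule["source"] != source or rule["source_field"] not in updated:
            continue
        new_value = updated[rule["source_field"]]
        for dest in rule["destinations"]:
            plan.append((
                dest,
                {rule["match_field"]: doc_id},
                {"$set": {rule["destination_field"]: new_value}},
            ))
    return plan
```

Each triple could then be applied with `db[collection].update_many(filter, update)`. Keeping the planning step free of I/O makes the rules easy to test on their own.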

What we did here is abstract out the update queries that were propagating the changes to the destination collections. We moved them from the business logic into the change stream.

I have abstracted this one step further and produced a tiny library.

Suggestions and feedback are welcome.

Thank you Linda for proofreading.