Azure Cosmos DB Caching Tips

Derek Comartin
Feb 23, 2017 · 2 min read

I’ve started to use Azure Cosmos DB a bit more over the last couple of weeks and I’m really enjoying it. The first real-world scenario that I hit was needing to implement optimistic concurrency.

While working on that, I discovered two caching optimizations you can make for better performance when accessing individual documents.

Caching SelfLinks

If you are using the .NET SDK, each document contains a unique SelfLink property. This is represented by the _self property in the JSON.

https://gist.github.com/dcomartin/76b43d9395ecd53fb4f80cf1e7c55d1f

Because the SelfLink is immutable, we can cache it and then use it to access the associated document.

It is more efficient to access the document directly via the SelfLink rather than querying the collection and filtering by Id.

ETags

Each document within Azure Cosmos DB also has an ETag property. This is the _etag property in the JSON document, or the ETag property on your Document when you are using the .NET SDK.

The ETag or entity tag is part of HTTP, the protocol for the World Wide Web. It is one of several mechanisms that HTTP provides for web cache validation, which allows a client to make conditional requests.

You may be familiar with ETags as they relate to caching. In a typical scenario, a user makes an HTTP request to the server for a specific resource. The server returns the response along with an ETag in the response header. The client then caches the response along with the associated ETag.

ETag: "686897696a7c876b7e"

If the client then makes another request for the same resource, it will pass an If-None-Match header with the ETag it received.

If-None-Match: "686897696a7c876b7e"

If the resource has not changed and the ETag represents the current version, the server will return a 304 Not Modified status with no body. If the resource has been modified, it will return a 200 status code along with the new content and a new ETag header.
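The exchange above can be sketched in a few lines. This is only an illustration in Python, not the .NET SDK and not a real HTTP server: handle_get and make_etag are hypothetical names, and hashing the body is just one assumed way a server might derive an ETag.

```python
import hashlib
from typing import Optional, Tuple


def make_etag(body: bytes) -> str:
    # Assumed scheme: a quoted, truncated hash of the body.
    # Real servers are free to generate ETags however they like.
    return '"' + hashlib.sha256(body).hexdigest()[:18] + '"'


def handle_get(body: bytes, if_none_match: Optional[str] = None) -> Tuple[int, bytes, str]:
    """Simulate a GET for a resource whose current content is `body`.

    Returns (status, response_body, etag).
    """
    etag = make_etag(body)
    if if_none_match == etag:
        # The client's cached copy is still current: send no body.
        return 304, b"", etag
    # First request, or the resource changed: send the full content.
    return 200, body, etag


# First request: no ETag yet, so the full response comes back.
status, cached_body, cached_etag = handle_get(b"version 1")

# Revalidation while the resource is unchanged.
status_unchanged, body_unchanged, _ = handle_get(b"version 1", if_none_match=cached_etag)

# Revalidation after the resource has been modified.
status_changed, new_body, new_etag = handle_get(b"version 2", if_none_match=cached_etag)
```

The point to notice is that the 304 path returns an empty body, so the client only pays for the full content when it has actually changed.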

AccessCondition

Azure Cosmos DB uses ETags for caching exactly as you would expect. We can store the ETag when we retrieve a document and then use that ETag when we need to fetch the same document again. We do this by creating an AccessCondition, specifying IfNoneMatch as the AccessConditionType and the stored ETag as the Condition, and passing it when we call ReadDocumentAsync.

Cache Client

Putting it all together can look something like this. I’m using MemoryCache to store our fetched documents. Since these documents contain the SelfLink, we can make any later request for that document directly. And with the ETag on the document, when we read the document directly we can specify an If-None-Match condition so the server returns a 304 Not Modified if nothing has changed.
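As a rough, language-neutral picture of that flow, here is a minimal Python sketch. It does not use the Cosmos DB SDK: FakeDocumentStore, CachingClient, the "docs/<id>" self link format, and the bodies_sent counter are all hypothetical stand-ins so the example can run on its own.

```python
import uuid
from typing import Dict, Optional, Tuple


class FakeDocumentStore:
    """Hypothetical in-memory stand-in for a document database."""

    def __init__(self) -> None:
        self._docs: Dict[str, dict] = {}  # self link -> document
        self.bodies_sent = 0              # how many full documents were returned

    def upsert(self, doc_id: str, data: dict) -> None:
        self_link = f"docs/{doc_id}"      # assumed format; stays the same across updates
        self._docs[self_link] = {
            "id": doc_id,
            "_self": self_link,
            "_etag": uuid.uuid4().hex,    # a new ETag on every write
            "data": data,
        }

    def query_by_id(self, doc_id: str) -> Optional[dict]:
        # Scan the collection and filter by id.
        for doc in self._docs.values():
            if doc["id"] == doc_id:
                self.bodies_sent += 1
                return dict(doc)
        return None

    def read(self, self_link: str, if_none_match: Optional[str] = None) -> Tuple[int, Optional[dict]]:
        # Direct read by self link, with an optional If-None-Match condition.
        doc = self._docs[self_link]
        if if_none_match == doc["_etag"]:
            return 304, None
        self.bodies_sent += 1
        return 200, dict(doc)


class CachingClient:
    def __init__(self, store: FakeDocumentStore) -> None:
        self._store = store
        self._cache: Dict[str, dict] = {}  # id -> last document seen

    def get(self, doc_id: str) -> Optional[dict]:
        cached = self._cache.get(doc_id)
        if cached is None:
            # Nothing cached yet: fall back to a query by id.
            doc = self._store.query_by_id(doc_id)
            if doc is not None:
                self._cache[doc_id] = doc
            return doc
        # Cached: read directly by self link and revalidate with the ETag.
        status, doc = self._store.read(cached["_self"], if_none_match=cached["_etag"])
        if status == 304:
            return cached
        self._cache[doc_id] = doc
        return doc


store = FakeDocumentStore()
client = CachingClient(store)
store.upsert("42", {"name": "first"})

first = client.get("42")    # query by id, full document returned
second = client.get("42")   # direct read, 304, cached copy reused
store.upsert("42", {"name": "second"})
third = client.get("42")    # direct read, ETag mismatch, fresh document
```

Only the first and third calls transfer a full document here; the second is answered from the cache after a cheap revalidation.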

https://gist.github.com/dcomartin/2a0e0378d847360001442c20c12c6dbc

https://gist.github.com/dcomartin/59fb2cd63aacd553058fc5d096b4eea3

I’ve put together a small .NET Core sample with an xUnit test based on the code above. All the source code for this series is available on GitHub.

Are you using Azure Cosmos DB? I’d love to hear about your experiences so far. Let me know on Twitter or in the comments.

Follow @codeopinion
