DocumentDB — retrieving a random document
As of this writing, DocumentDB does not support retrieving random documents from a collection. Therefore, I created a DocumentDB stored procedure (sproc) to extend the database and implement this functionality. As you may know, DocumentDB lets you define server-side JavaScript code to implement user-defined functions, triggers, or stored procedures.
This sproc is a derived work of the Count.js sample provided by the Microsoft engineering team, and is made public under the MIT license here. I also kept the comments from the original Count.js file, so you can understand how the procedure works and how to tweak it to your needs. Here’s the source:
A couple of important points to take into consideration:
- The sproc could potentially iterate through your whole collection, depending on the value of filterString you pass to it upon execution. Make sure filterString is as lightweight on the database as possible. For example, if you need a random post from your entire collection, use a filter that returns only the unique ID of each document. You can then use the ID of the randomly selected document to query the database again and get the whole document:
SELECT p.id FROM p
- The execution of the sproc might return before completion. This can be caused by a timeout on the database, or by reaching the maximum number of elements allowed in the procedure (defined in the variable maxResult). If this happens, the sproc returns a random post and a continuation token. If you do receive a continuation token, save the result and re-run the sproc with the continuation token (for example, use an array in your app to aggregate all results). Once the sproc returns without a continuation token, you know that it has iterated through your collection (or filter), and you can then select a random index from the array where you’ve been collecting the results.
- The code above consolidates the sproc within an object literal and uses module.exports. This way, you can import the file into a routine that checks for the existence of the stored procedure in the database and creates the sproc in your collection if it is missing. If you just want the pure source of the sproc, then copy/paste the code of the function getRandomDocument.
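The continuation handling described in the second point can be sketched on the client side roughly as follows. This is a minimal sketch, not part of the sproc itself: `runSproc` is a hypothetical function you would write around your DocumentDB client, and the response shape `{ result, continuation }` is an assumption based on the description above (one random post per call, plus a continuation token when the sproc stopped early).

```javascript
// Re-runs the sproc until no continuation token comes back, collecting
// the random post returned by each run.
// ASSUMPTION: `runSproc(continuation)` is a hypothetical wrapper around your
// DocumentDB client that resolves to { result, continuation }.
async function collectRandomCandidates(runSproc) {
  const candidates = [];
  let continuation; // undefined on the first call
  do {
    const response = await runSproc(continuation);
    candidates.push(response.result);
    continuation = response.continuation;
  } while (continuation);
  return candidates;
}

// Selects a random index from the aggregated results.
// `random` is injectable so the selection can be made deterministic in tests.
function pickRandom(items, random = Math.random) {
  if (items.length === 0) return undefined;
  return items[Math.floor(random() * items.length)];
}
```

With a lightweight filter such as the SELECT p.id query above, the value returned by pickRandom would only contain the ID, which you would then use to fetch the full document in a second query.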
If you have feedback or improvement suggestions for this code, comment below or ping me on Twitter. DocumentDB is an awesome, fully-managed, document-based database within Microsoft Azure, so be sure to follow the engineering team over there as well.