Understanding CosmosDB Continuation Tokens (HasMoreResults) and ConnectionPolicy RequestTimeouts

Nov 26, 2018

Continuation Tokens

The CosmosDB client libraries use a continuation strategy to manage the results returned from queries. Every query submitted to CosmosDB carries a MaxItemCount limit in the request headers. The default limit is 100; however, any Int32 value can be used. This value is not a definitive ceiling: other constraints can come into effect and dynamically reduce the number of items returned per page. We will come back to this point later. Results that exceed the MaxItemCount limit are paginated. The first page is returned to the client as a partial result along with the response headers. Within the response headers, a continuation token (“x-ms-continuation”) is present, indicating that only a partial result was returned and that more records are available on request.

Example continuation token and record count.

Further pages of records can then be retrieved by supplying the continuation token in subsequent calls. The coordination between the CosmosDB service and the client is taken care of behind the scenes by the SDKs. All the developer needs to do is check the HasMoreResults boolean on the query object (the IDocumentQuery<T> returned by AsDocumentQuery()). If HasMoreResults is true, there may be more records to fetch. Calling the ExecuteNextAsync method retrieves the next page of results.

A very simple example might look something like this:

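using System.Threading.Tasks;
using Microsoft.Azure.Documents.Client;
using Microsoft.Azure.Documents.Linq;

// Returns Task rather than void so callers can await it and observe exceptions.
public static async Task RequestDocuments<T>(
    DocumentClient client,
    string collectionLink,
    string query)
{
    using (var queryable = client.CreateDocumentQuery<T>(
        collectionLink, query,
        new FeedOptions()).AsDocumentQuery())
    {
        // Keep fetching pages for as long as the query reports more results.
        while (queryable.HasMoreResults)
        {
            foreach (T item in
                await queryable.ExecuteNextAsync<T>())
            {
                // Iterate through results.
            }
        }
    }
}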
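
To make the token hand-off described above more tangible, here is a toy, in-memory simulation of the paging loop. It is not the CosmosDB SDK: FakePagedQuery, PagingDemo and the use of a plain offset as the "continuation token" are all hypothetical names and simplifications invented for illustration, and the real token is an opaque string managed by the service.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Toy in-memory stand-in for a paged query. NOT the CosmosDB SDK:
// every name here is hypothetical and exists only for illustration.
public sealed class FakePagedQuery
{
    private readonly IReadOnlyList<int> _items;
    private readonly int _maxItemCount;
    private string _continuation; // null means "start from the beginning"

    public FakePagedQuery(IReadOnlyList<int> items, int maxItemCount)
    {
        if (maxItemCount <= 0) throw new ArgumentOutOfRangeException(nameof(maxItemCount));
        _items = items;
        _maxItemCount = maxItemCount;
        HasMoreResults = true; // nothing fetched yet, so a first page may exist
    }

    public bool HasMoreResults { get; private set; }

    // Returns one page and remembers where the next page starts.
    public List<int> ExecuteNext()
    {
        int offset = _continuation == null ? 0 : int.Parse(_continuation);
        List<int> page = _items.Skip(offset).Take(_maxItemCount).ToList();
        int next = offset + page.Count;

        // A token is only handed back while records remain.
        _continuation = next < _items.Count ? next.ToString() : null;
        HasMoreResults = _continuation != null;
        return page;
    }
}

public static class PagingDemo
{
    // Drains the query page by page and returns the size of each page.
    public static List<int> PageSizes(int totalItems, int maxItemCount)
    {
        var query = new FakePagedQuery(Enumerable.Range(0, totalItems).ToList(), maxItemCount);
        var sizes = new List<int>();
        while (query.HasMoreResults)
        {
            sizes.Add(query.ExecuteNext().Count);
        }
        return sizes;
    }
}
```

Calling PagingDemo.PageSizes(250, 100) walks three pages of 100, 100 and 50 items. The loop has the same shape as the SDK loop: keep asking for the next page until no continuation is handed back.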

RequestTimeout

The RequestTimeout property on the CosmosDB SDK’s ConnectionPolicy class can be misleading. If you’re coming from an RDBMS background, it’s easily confused with the CommandTimeout property found in classes like SqlCommand. The expectation of any timeout parameter is that it terminates an activity after a defined period has elapsed. Setting the CommandTimeout property on the SqlCommand class means that a T-SQL query sent to the server must complete within the set period, or the activity is terminated and an exception is raised. ConnectionPolicy.RequestTimeout does implement the same time limit logic, but the CosmosDB client handles database interaction in a very different manner from traditional RDBMS SDKs. With an RDBMS, the query author has virtually no limitations. It’s possible to query huge amounts of data and consume vast amounts of system resources without consideration for any other users of the system.

CosmosDB is different: the client libraries expose numerous query methods, and these methods implement a continuation strategy. The idea of sending a query with unlimited compute consumption to the data store and returning datasets of unbounded size has been abolished. CosmosDB has been designed from the ground up with massive data volumes and predictable performance in mind, so there are some unavoidable, though configurable, caps and limits in play when using the client libraries. The combination of these caps and limits means that returned results have finite boundaries.

Now that we understand the continuation mechanism, we can see how the RequestTimeout constraint works. The timeout does not apply to the gross compute time of the query; it applies to the partial compute time. The query is broken down into batches of results using continuation tokens, and it’s those individual partial result sets that are governed by the request timeout. To put it another way, the timer is reset for each call to ExecuteNextAsync().
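
For orientation, this is roughly where the two settings discussed here live in the .NET SDK (the Microsoft.Azure.Documents.Client namespace). Treat it as a sketch: ClientFactory is a hypothetical helper name, the endpoint and key are placeholders supplied by the caller, and the values of 1 second and 100 items are illustrative rather than recommendations.

```csharp
using System;
using Microsoft.Azure.Documents.Client;

// Hypothetical helper, for illustration only.
public static class ClientFactory
{
    // 'endpoint' and 'authKey' are placeholders for your own account details.
    public static DocumentClient Create(Uri endpoint, string authKey)
    {
        var policy = new ConnectionPolicy
        {
            // Governs each individual request (each page fetched),
            // not the paged query as a whole.
            RequestTimeout = TimeSpan.FromSeconds(1)
        };
        return new DocumentClient(endpoint, authKey, policy);
    }

    // Page size requested for a query; the service may return fewer items.
    public static FeedOptions PageOptions()
    {
        return new FeedOptions { MaxItemCount = 100 };
    }
}
```

The timeout is set once on the client, while MaxItemCount is passed with each query through FeedOptions.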

For example, suppose my MaxItemCount is 100, my RequestTimeout is 1 second and my query returns 500 documents. The initial call would return 100 documents and a continuation token. I would then need to make four more ExecuteNextAsync() calls to return the complete result set. If each request executed in 500ms (I have a slow but consistent network), my gross request time would be 2.5 seconds, and I would not receive a RequestTimeout because each component part of the total request executed in less than 1 second.
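
The arithmetic in that example can be written down as a small helper. This is a back-of-the-envelope sketch, not SDK code: PagingMath and its method names are hypothetical, and it assumes every page is filled up to MaxItemCount and that every request takes the same time.

```csharp
using System;

// Hypothetical helper for reasoning about paged queries; not part of any SDK.
public static class PagingMath
{
    // Number of ExecuteNextAsync() calls needed to drain the result set
    // (ceiling division), assuming every page is filled to maxItemCount.
    public static int PagesNeeded(int totalDocuments, int maxItemCount)
    {
        if (maxItemCount <= 0) throw new ArgumentOutOfRangeException(nameof(maxItemCount));
        return (totalDocuments + maxItemCount - 1) / maxItemCount;
    }

    // Gross time for the whole query, assuming a constant per-page latency.
    public static TimeSpan TotalTime(int pages, TimeSpan perPageLatency)
    {
        return TimeSpan.FromTicks(perPageLatency.Ticks * pages);
    }

    // The timeout is evaluated per request, so only the per-page latency matters.
    public static bool AnyPageTimesOut(TimeSpan perPageLatency, TimeSpan requestTimeout)
    {
        return perPageLatency > requestTimeout;
    }
}
```

With 500 documents, a MaxItemCount of 100 and 500ms per request, PagesNeeded gives 5, TotalTime gives 2.5 seconds, and AnyPageTimesOut is false for a 1 second RequestTimeout, even though the total exceeds the timeout.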

Long Running Queries

With this new knowledge in mind, you can appreciate that it’s possible to have long-running queries. If the query returns a large amount of data, many requests will be executed in the continuation loop and the total time will add up.

Query Request Timeout

It is possible to implement some defensive coding to mitigate the risk of a rogue query running away and consuming excessive RUs. This constraining behaviour can be achieved by adding a simple elapsed-time check, and optionally an RU budget check, to the paging loop.

For example:

// Assumes 'prices' is an IDocumentQuery<PricePoint> and 'pp' is a List<PricePoint>.
int TotalRequestTimeout_ms = 10000; // 10 seconds
int TotalRequestUnit_budget = 100; // 100 RU
Stopwatch sw = Stopwatch.StartNew();
double requestUnitsConsumed = 0;
int documentCount = 0; // response.Count is the number of documents in the page
while (prices.HasMoreResults)
{
    FeedResponse<PricePoint> response = await prices.ExecuteNextAsync<PricePoint>();
    requestUnitsConsumed += response.RequestCharge;
    documentCount += response.Count;
    foreach (PricePoint p in response)
    {
        pp.Add(p);
    }
    // Checked after each page, so either limit can be overshot by up to one page.
    if (sw.ElapsedMilliseconds > TotalRequestTimeout_ms) throw new Exception("TotalRequest time out!");
    if (requestUnitsConsumed > TotalRequestUnit_budget) throw new Exception("TotalRequestUnits budget breached!");
}

The code used here simply illustrates the concept and is not a coding recommendation.

Other Constraints

Previously I mentioned that there are other constraints that can affect the number of documents returned by a query. I’ll cover them briefly here.

RU Limit

If a request is made via the SDK after the target CosmosDB collection has exceeded its provisioned RU capacity, the request is rejected with an HTTP 429 status code. The SDK retries throttled requests automatically; if the retries are exhausted, an exception carrying the 429 status code is thrown.

4MB Packet Size Limit

The response payload is restricted to 4MB in size. If I set the MaxItemCount to something big like 10000 and my average document size is b bytes, then (b * number-of-documents) must not exceed 4,194,304 bytes (4 MB). As an example, imagine your average document size is 600 bytes (roughly 0.6 KB); you could then fit only about 6,990 documents into a 4MB chunk. Even if I set my MaxItemCount to 10000, I would still only receive around 6,990 documents per request.
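
The same estimate can be expressed as a short calculation. This is a rough sketch under stated assumptions: PageSizeEstimate is a hypothetical helper, 4 MB is taken as 4,194,304 bytes and 0.6 KB as 600 bytes (the reading that yields the 6990 figure), and any response envelope overhead is ignored.

```csharp
using System;

// Hypothetical estimator; not part of any SDK.
public static class PageSizeEstimate
{
    // 4 MB response limit expressed in bytes (4 * 1024 * 1024 = 4,194,304).
    public const long ResponseLimitBytes = 4L * 1024 * 1024;

    // Rough number of documents per page: the smaller of the requested
    // MaxItemCount and the number of average-sized documents that fit in 4 MB.
    public static int DocumentsPerPage(int maxItemCount, int averageDocumentBytes)
    {
        if (averageDocumentBytes <= 0) throw new ArgumentOutOfRangeException(nameof(averageDocumentBytes));
        long fitBySize = ResponseLimitBytes / averageDocumentBytes;
        return (int)Math.Min(maxItemCount, fitBySize);
    }
}
```

DocumentsPerPage(10000, 600) evaluates to 6990, whereas DocumentsPerPage(100, 600) stays at 100 because the item count limit is reached long before the size limit.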