Keluro makes extensive use of DocumentDB for data persistence. However, its impressive scaling capabilities come at a price: your queries or commands may exceed the number of request units (RU) you are granted. In that case you will receive a 429 error, “Request rate too large”, surfaced as a “DocumentClientException” if you use the .NET SDK. It is then your responsibility to implement the retry policies to avoid such a failure and to wait the proper amount of time before retrying.
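
For illustration, here is roughly how such a throttled request surfaces with the .NET SDK; the collectionUri and document arguments are placeholders:

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.Azure.Documents;
using Microsoft.Azure.Documents.Client;

public static class ThrottlingExample
{
    // Placeholder example: collectionUri and document are assumptions.
    public static async Task InsertOnceAsync(DocumentClient client, Uri collectionUri, object document)
    {
        try
        {
            await client.CreateDocumentAsync(collectionUri, document);
        }
        catch (DocumentClientException ex) when ((int?)ex.StatusCode == 429)
        {
            // "Request rate too large": the exception carries the delay to
            // respect (RetryAfter) before the request may be attempted again.
            await Task.Delay(ex.RetryAfter);
            // ...then retry, see the helpers at the end of the post.
        }
    }
}
```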

Edit: see the comment below. The release v1.8.0 of the .NET SDK offers some configuration options for these retry policies.
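
For reference, these settings are exposed through ConnectionPolicy.RetryOptions; the endpoint and key below are placeholders:

```csharp
using System;
using Microsoft.Azure.Documents.Client;

public static class ClientFactory
{
    public static DocumentClient Create()
    {
        // Built-in throttling retries, available since v1.8.0 of the SDK.
        var connectionPolicy = new ConnectionPolicy();
        connectionPolicy.RetryOptions.MaxRetryAttemptsOnThrottledRequests = 5;
        connectionPolicy.RetryOptions.MaxRetryWaitTimeInSeconds = 30;

        // Placeholder endpoint and key.
        return new DocumentClient(
            new Uri("https://<your-account>.documents.azure.com:443/"),
            "<your-auth-key>",
            connectionPolicy);
    }
}
```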

Microsoft provides some samples on how to handle this 429 “Request rate too large” error, but they only concern commands, such as inserting or deleting a document; there is no sample on how to implement the retry policies for regular queries. A NuGet package is also available, “Microsoft.Azure.Documents.Client.TransientFaultHandling”, but even if integrating it is as quick as an eye blink, it has no logging capabilities. In my case it did not really solve my RU-exceeding problem, I even doubt that I made it work, and the code is not open source. So I decided to integrate the ideas from the samples into my own utility classes on top of the DocumentDB .NET SDK.

The idea is similar to the “TransientFaultHandling” package: wrap the DocumentClient inside another class exposed only through an interface. By all accounts, it is a good thing to abstract the DocumentClient behind an interface for testability purposes. In our case this interface is named IDocumentClientWrapped.
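
As a rough sketch, assuming hypothetical member names rather than the exact ones used at Keluro, such an interface could look like this:

```csharp
using System;
using System.Threading.Tasks;

// Illustrative sketch: the member names are assumptions, not the exact API.
public interface IDocumentClientWrapped : IDisposable
{
    // Queries: return our retrying wrapper instead of a raw IQueryable<T>.
    IRetryQueryable<T> CreateDocumentQuery<T>(Uri collectionUri);

    // Commands: wrapped so that 429 responses are retried internally.
    Task CreateDocumentAsync(Uri collectionUri, object document);
    Task DeleteDocumentAsync(Uri documentUri);
}
```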

Instead of returning an IQueryable<T> instance as DocumentClient would do, we return an IRetryQueryable<T>. This latter type, whose definition will follow, is also a wrapper, this time on the IQueryable<T> instance returned by the DocumentDB client. However, this interface explicitly retries when the enumeration fails because of a 429 “request rate too large” exception raised by the database engine, DocumentDB in our case.

In this interface we only expose the extension methods that are actually supported by the “real” IQueryable<T> instance returned by DocumentDB: Select, SelectMany, Where, etc. For example, at the time of writing, GroupBy is not supported: you would get a runtime exception if you used it directly on the IQueryable<T> instance returned by DocumentClient.
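
Here is a minimal sketch of what IRetryQueryable<T> could look like; the member list is illustrative, not exhaustive:

```csharp
using System;
using System.Collections.Generic;
using System.Linq.Expressions;

// Illustrative sketch: only operators supported by the DocumentDB LINQ
// provider are exposed (no GroupBy, for instance).
public interface IRetryQueryable<T>
{
    IRetryQueryable<T> Where(Expression<Func<T, bool>> predicate);
    IRetryQueryable<TResult> Select<TResult>(Expression<Func<T, TResult>> selector);
    IRetryQueryable<TResult> SelectMany<TResult>(Expression<Func<T, IEnumerable<TResult>>> selector);

    // Enumeration that retries on 429 "request rate too large" responses.
    IEnumerable<T> AsRetryEnumerable();
}
```

In such a design, each operator simply forwards to the underlying IQueryable<T> and wraps the result again.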

Now look at how we use these interfaces in the calling code.
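
Roughly, assuming a collectionUri variable and a hypothetical Category property on MyType, it looks like this:

```csharp
IRetryQueryable<MyType> query = documentClientWrapped
    .CreateDocumentQuery<MyType>(collectionUri)
    .Where(d => d.Category == "books");

// The Where constraint is translated by the LINQ provider and evaluated by
// DocumentDB; the enumeration transparently retries on 429 responses.
foreach (MyType document in query.AsRetryEnumerable())
{
    // process the document...
}
```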

Independently of the retry policy classes and the “request rate too large” errors, let me emphasize that LINQ can be tricky here. Indeed, the piece of code above is completely different from the one shown next.
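
Again, a sketch with the same placeholder names as above:

```csharp
// AsRetryEnumerable() is called *before* Where: every document of the
// collection is fetched, then filtered by LINQ-to-Objects on the app server.
IEnumerable<MyType> documents = documentClientWrapped
    .CreateDocumentQuery<MyType>(collectionUri)
    .AsRetryEnumerable()
    .Where(d => d.Category == "books");
```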

In this case the Where constraint is performed “in memory” by your .NET application server, which means that all the documents have first been fetched from DocumentDB. If MyType contains a lot of data, and/or if the Where constraint filters out many documents, everything is still transferred from DocumentDB to your application server, and you will probably end up with a bottleneck.

Let us get back to our problem. Now that we have seen that adding a retry policy to a query only means calling AsRetryEnumerable() instead of AsEnumerable(), let us jump to the implementation of those classes.

The idea is to use an IEnumerator that “retries” and two utility methods: ExecuteWithRetry and ExecuteWithRetryAsync. The former is for basic single-threaded calls, while the latter is for the async/await context. Most of this code is verbose because it only wraps the underlying implementation. I hope it will be helpful for others.
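
As a simplified sketch of these two helpers (a complete version should also unwrap a DocumentClientException nested inside an AggregateException when enumerating synchronously):

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Azure.Documents;

// Minimal sketch of the two retry helpers. The retrying IEnumerator simply
// calls ExecuteWithRetry around MoveNext() on the underlying enumerator.
public static class RetryHelpers
{
    public static T ExecuteWithRetry<T>(Func<T> operation)
    {
        while (true)
        {
            try
            {
                return operation();
            }
            catch (DocumentClientException ex) when ((int?)ex.StatusCode == 429)
            {
                // The service tells us how long to wait before retrying.
                Thread.Sleep(ex.RetryAfter);
            }
        }
    }

    public static async Task<T> ExecuteWithRetryAsync<T>(Func<Task<T>> operation)
    {
        while (true)
        {
            try
            {
                return await operation();
            }
            catch (DocumentClientException ex) when ((int?)ex.StatusCode == 429)
            {
                await Task.Delay(ex.RetryAfter);
            }
        }
    }
}
```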