2017-09-28 22:10 UTC
James Duggan

CosmosDb, know your costs, and remember…

This will be a short post to emphasize a simple point, yet one that should make an enormous difference to how you approach configuring a CosmosDb collection and modelling documents to support read and write requirements.

Know your costs

I cannot emphasize this point enough.

The folks at Microsoft have made this really easy. You can use the Request Units (RU) and Data Storage calculator, the collection Query Explorer in the Azure Portal, or a REST client such as Postman coupled with the really useful library and samples by a Microsoftie over on git (the documentdb postman collection).
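If you later want to script the same check, the charge for each REST request comes back in the `x-ms-request-charge` response header. Here is a minimal Python sketch for pulling it out of a set of response headers. The function name and the sample header values are made up for illustration.

```python
def request_charge(headers):
    """Return the RU charge reported in a Cosmos DB REST response's headers."""
    # HTTP header names are case-insensitive, so normalise before looking up.
    lowered = {name.lower(): value for name, value in headers.items()}
    return float(lowered["x-ms-request-charge"])


# Hypothetical headers, as you might copy them from a REST client's response pane.
sample_headers = {
    "Content-Type": "application/json",
    "x-ms-request-charge": "5.71",
}
print(request_charge(sample_headers))
```

In Postman the same value is visible directly in the response's Headers tab, so no code is needed to read it by hand.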

Let’s be very clear about what this means. Armed with a JSON document, without writing or compiling a line of code, you can determine the RU cost of each CRUD operation for that document. If we add in throughput, we can arrive at some reasonable cost estimates. However, since throughput is not readily available in the early stages of design, what use can be made of CRUD RU costs? Let’s take a look at an extract of RU costs I’ve recently been working on.
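Once you do have a throughput guess, the arithmetic is simple: multiply each operation's RU cost by its expected rate and add them up. The sketch below shows that in Python. The RU costs and operation rates are placeholder numbers, not measurements from this post, and rounding up to a multiple of 100 RU/s is an assumption about how throughput is provisioned.

```python
import math


def required_throughput(ru_per_operation, operations_per_second):
    """Sum RU cost x rate over every operation type to get RU per second."""
    return sum(
        ru_per_operation[operation] * rate
        for operation, rate in operations_per_second.items()
    )


def provisioned_throughput(ru_per_second, increment=100):
    """Round up to the next provisioning increment (assumed to be 100 RU/s)."""
    return math.ceil(ru_per_second / increment) * increment


# Placeholder numbers: swap in the RU charges you measured and your own rates.
ru_costs = {"create": 10.0, "read": 1.0, "query": 3.0}
rates = {"create": 20, "read": 200, "query": 50}

needed = required_throughput(ru_costs, rates)
print(needed, provisioned_throughput(needed))
```

Even with rough rates, this makes it obvious which operation dominates the bill, which is the one worth remodelling first.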

CosmosDb RU costs

Here (using Postman) I have been looking at the costs of two document types for a fixed 10 GB single-partition collection configured with full, lazy and no indexing. These are basically the upper (full indexing) and lower (no indexing) bounds, with a “what is lazy indexing?” question thrown in. Here is a summary of some points I took away:

  • Full indexing comes at a cost. Do I need it?
  • Reading by the document id is the cheapest way to return a document
  • Querying by secondary indexes is expensive. Can I avoid it?
  • There are built-in smarts to support a query when the id is referenced, even when secondary indexing is disabled.
  • Updates are expensive. This makes sense, as partial document update is not supported. The question then is whether the update is really necessary.
  • Deletes are expensive. Again, is it necessary? For example, can it be implemented for free by setting TTL?
  • Upsert is syntactic sugar. In other words, there is no additional cost for using it.
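To make the TTL point concrete, here is a minimal Python sketch that stamps a document with a `ttl` value before it is written, so that it expires on its own instead of needing an explicit delete. The helper name and the sample document are hypothetical. It also assumes TTL has been enabled on the collection, since a per-document `ttl` has no effect otherwise.

```python
def with_ttl(document, seconds):
    """Return a copy of the document that carries a per-document TTL in seconds."""
    if seconds <= 0:
        raise ValueError("ttl must be a positive number of seconds")
    expiring = dict(document)  # shallow copy, so the caller's document is untouched
    expiring["ttl"] = seconds  # counted from the document's last write
    return expiring


# Hypothetical document that should disappear one hour after it is written.
session = {"id": "session-42", "user": "jduggan"}
print(with_ttl(session, 3600))
```

Note that adding `ttl` to a document that already exists is itself an update and is charged as one, so this only avoids the delete cost when the expiry is known at write time.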

And remember…

Remember what? It has been covered, albeit not clearly stated.

Not all documents are created equal

Yes, you read that right. If you refer back to the extract above you will see the whopping RU cost associated with creating Document 2. Take note of the fact that it has a significantly smaller document size than Document 1, yet incurs a significantly higher write cost.

This is not hidden. In fact, the header of the Request Units (RU) and Data Storage calculator clearly states:

“…request unit consumption varies by operation and JSON document…”

At the end of the day, all this really means is that we have a new toy to play with and all the same concerns to be concerned about.
