Estimating Batch Job Size in CosmosDB

Feb 1, 2018 19:38 · 450 words · 3 minutes read cosmosdb

CosmosDB enforces a maximum RU/s (request units per second) cap that is shared by all queries hitting the same collection (when using the MongoDB API). For batch operations, this means a single batch doing enough inserts or updates against a collection can cause 429 errors to be returned because the request rate is too large. To work around this, you can size your batches to greatly reduce the chance of encountering this error in practice.

The first step is to calculate the cost in RU of a single operation. If you are doing a batch insert, for example, construct a sample document that matches the general shape of the documents you plan to insert in batches. Then run your insert:

db.collection.insert({ ... })
db.runCommand({getLastRequestStatistics: 1})

This returns a result containing:

    "RequestCharge": 10

This is how many RUs a single document insert will cost. Next, you want to see how much latency is involved in transacting against CosmosDB. Measure the latency from the server your application will be running on (or one in the same network).
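One rough way to take that measurement is to time a handful of operations and use the median. The sketch below is only an illustration: `median` and `medianLatencyMs` are hypothetical helper names, not part of any CosmosDB or MongoDB API, and it assumes the operation you pass in runs synchronously (as calls in the mongo shell do).

```javascript
// Hypothetical helpers for illustration; not part of any CosmosDB or MongoDB API.

// Returns the median of a non-empty array of numbers.
function median(values) {
  var sorted = values.slice().sort(function (a, b) { return a - b; });
  var mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1
    ? sorted[mid]
    : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Runs `op` synchronously `samples` times (samples >= 1) and returns the
// median wall-clock duration of one call, in milliseconds.
function medianLatencyMs(op, samples) {
  var timings = [];
  for (var i = 0; i < samples; i++) {
    var start = Date.now();
    op();
    timings.push(Date.now() - start);
  }
  return median(timings);
}

// Possible use in the mongo shell. This assumes `db.collection` is the target
// collection and that writing 20 throwaway sample documents is acceptable:
// medianLatencyMs(function () { db.collection.insert({ sample: true }); }, 20);
```

The median is used here rather than the mean so that one slow outlier does not skew the estimate; that is a design choice for this sketch, not something CosmosDB requires.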


I’m not sure if different latency numbers will be returned with different operations, so you may want to experiment a bit with this.

The RU/s consumed by a given batch operation is calculated as follows, with latency measured in seconds:

RU/s = (RU_of_single_operation * batch_size) / latency

Thus if you are inserting a small document and it takes 10 RU, with a batch size of 100 and a latency of 200ms, the total RU/s will be:

RU/s = (10 * 100) / 0.2 = 5000
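The same arithmetic can be wrapped in a small function. This is a minimal sketch: `estimateRuPerSecond` is a hypothetical name, and it takes latency in milliseconds (an assumption made here for convenience) and converts to seconds internally.

```javascript
// Hypothetical helper mirroring the formula above.
// Latency is passed in milliseconds and converted to seconds in the division.
function estimateRuPerSecond(ruPerOperation, batchSize, latencyMs) {
  if (latencyMs <= 0) {
    throw new Error("latencyMs must be positive");
  }
  return (ruPerOperation * batchSize * 1000) / latencyMs;
}

// The numbers from the example: 10 RU per insert, batch of 100, 200 ms latency.
var estimatedRuPerSecond = estimateRuPerSecond(10, 100, 200); // 5000
```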

However, this doesn’t mean that you need to provision 5000 RU/s for this batch to succeed. If it were the only operation running at the time, you would only need 1000 RU/s provisioned, because the batch costs 10 * 100 = 1000 RU in total and nothing else is consuming RU.

From this you can also calculate batch size. If you have a collection with 10000 RU/s, a latency of 100 ms, and documents that take 2.5 RU each to insert, what batch size should you use to ensure it always succeeds?

batch_size = (RU/s * latency) / RU_of_single_operation = (10000 * 0.1) / 2.5 = 400
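As a rough sketch, the batch size calculation and the splitting of documents into batches could look like this. Both `maxBatchSize` and `toBatches` are hypothetical helpers, latency is again assumed to be given in milliseconds, and rounding down with a floor of one document per batch is a choice made for this example.

```javascript
// Hypothetical helpers for illustration; not part of any CosmosDB or MongoDB API.

// Applies the batch size formula above, rounding down so the estimate stays
// on the conservative side. Always returns at least 1.
function maxBatchSize(provisionedRuPerSecond, latencyMs, ruPerOperation) {
  if (ruPerOperation <= 0) {
    throw new Error("ruPerOperation must be positive");
  }
  var size = Math.floor(
    (provisionedRuPerSecond * latencyMs) / (1000 * ruPerOperation)
  );
  return Math.max(1, size);
}

// Splits an array of documents into consecutive batches of at most `batchSize`.
function toBatches(documents, batchSize) {
  var batches = [];
  for (var i = 0; i < documents.length; i += batchSize) {
    batches.push(documents.slice(i, i + batchSize));
  }
  return batches;
}

// The numbers from the example: 10000 RU/s, 100 ms latency, 2.5 RU per insert.
var batchSize = maxBatchSize(10000, 100, 2.5); // 400
```

Each array returned by `toBatches` can then be passed to whatever bulk insert call your driver provides.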

Again, if you know that this will be the only operation running at the time, you can leave out the latency calculation and choose a larger batch size.

In practice, the numbers you can calculate with this formula are conservative. The optimal batch size for your application depends on your data access pattern, so spend some time monitoring how many RU/s you consume under normal operation to see whether you need to adjust your batch size up or down.