## Limits and reservations
Automating (rate-)limiting at the platform level is often impossible, as traffic patterns and acceptable use cases are unknown.
Idle applications require fewer resources than busy ones. Technology choices also play an important role: the baseline resource usage of an nginx serving a static website is much lower than that of the average Tomcat.
Containerized applications make it easy to pack applications tightly, which can lead to overloaded nodes. Therefore, to keep applications “in check” and to avoid services and stacks impacting each other, resource limits (and reservations) can be applied.
`limits` and `reservations` work on the service level; consider the following example:
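A minimal sketch of such a configuration in a Compose file, matching the values discussed below (the service name and image are placeholders):

```yaml
services:
  app:
    image: nginx:alpine  # hypothetical service
    deploy:
      resources:
        limits:
          cpus: '0.50'   # at most half a CPU core
          memory: 512M   # at most ~512 MB of RAM
```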
In detail, the above example enforces a limit of half a CPU core and ~512 MB of RAM. If the service tries to allocate more, it will be killed (and restarted) by Docker. The notion of killing an application sounds drastic, but it’s really just a forceful restart.
The advantages of using `limits` are the following:
- Together with an alert on killed containers, we gain operational visibility (observability) and can investigate why the application (running in the container) was trying to allocate more resources than expected.
- All other services on the cluster are still operating - running and responding. Outages can be minimized or even completely mitigated.
While `limits` provide the upper boundary (or worst case), `reservations` are the opposite: they define the minimum required to start the application. Providing said minimum can be interpreted as a “fail early” approach.
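A corresponding sketch of a reservation, again with a placeholder service name and image, and the values discussed below:

```yaml
services:
  app:
    image: nginx:alpine  # hypothetical service
    deploy:
      resources:
        reservations:
          cpus: '0.10'   # needs at least a tenth of a CPU core
          memory: 256M   # needs at least ~256 MB of RAM
```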
The above example means that the service needs at least 0.1 CPU cores and ~256 MB of RAM to start.
The advantages are as follows:
- Knowing that the application requires a certain baseline of resources helps Docker schedule it on a node where these resources are available.
- An (early) failure to deploy a service gives us immediate feedback about resource availability.