## Limits and reservations
Automating (rate-)limiting at the platform level is often impossible, as traffic patterns and acceptable use cases are unknown.
Idle applications require fewer resources than busy ones. Technology choices also play an important role: the baseline resource usage of an nginx serving a static website is much lower than that of the average Tomcat.
Containerized applications make it easy to pack applications tightly, which can lead to overloaded nodes. Therefore, to keep applications “in check” and to avoid services and stacks impacting each other, resource limits (and reservations) can be applied.
`limits` and `reservations` work on the service level; consider the following example:
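A minimal sketch of such a configuration in a Compose file, matching the values discussed below (the service name and image are placeholders):

```yaml
services:
  app:
    image: nginx:alpine  # hypothetical service
    deploy:
      resources:
        limits:
          cpus: '0.50'   # at most half a CPU core
          memory: 512M   # at most ~512 MB of RAM
```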
In detail, the above example enforces a limit of half a CPU core and ~512 MB of RAM. If the service tries to allocate more, it will be killed (and restarted) by Docker. The notion of killing an application sounds drastic, but it’s really just a forceful restart.
The advantages of using `limits` are the following:
- Together with an alert on killed containers, we gain operational visibility (observability) and can investigate why the application (running in the container) was trying to allocate more resources than expected.
- All other services on the cluster are still operating - running and responding. Outages can be minimized or even completely mitigated.
While `limits` provide the upper boundary (or worst case), `reservations` are the opposite: they define the minimum required to start the application. Providing said minimum can be interpreted as a “fail early” approach.
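A corresponding sketch of a reservation, again with a placeholder service name and image, and the values discussed below:

```yaml
services:
  app:
    image: nginx:alpine  # hypothetical service
    deploy:
      resources:
        reservations:
          cpus: '0.10'   # needs at least a tenth of a CPU core
          memory: 256M   # needs at least ~256 MB of RAM
```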
The above example means that the service needs at least 0.1 CPU cores and ~256 MB of RAM to start.
The advantages are as follows:
- Knowing that the application requires a certain baseline of resources helps Docker schedule it on a node where these resources are available.
- An (early) failure to deploy a service gives us immediate feedback about resource availability.