Kubernetes is the most popular container orchestrator by far. Much of its success comes from its reliability. All software has bugs, yet Kubernetes is somehow less buggy than the alternatives when it comes to running your containers.
Kubernetes arrives at your desired number of running containers in time, and then unrelentingly keeps that number running. The Kubernetes documentation refers to this as self-healing. This behavior comes from a core philosophy in the design of Kubernetes.
“The goal seeking behavior of the control loop is very stable. This has been proven in Kubernetes where we have had bugs that have gone unnoticed because the control loop is fundamentally stable and will correct itself over time.
If you are edge triggered you run risk of compromising your state and never being able to re-create the state. If you are level triggered the pattern is very forgiving, and allows room for components not behaving as they should to be rectified. This is what makes Kubernetes work so well.”
― Joe Beda, CTO of Heptio (as quoted in Cloud Native Infrastructure, by Justin Garrison and Kris Nova)
Edge and level triggering for the same signal.
Edge and level triggering are concepts that come from electronics. They refer to how a system should respond to the shape of an electrical signal (or digital logic) over time. Should the system care about when the signal changes from low to high and from high to low, or should it care about whether the signal is high?
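As a rough illustration of the two questions, the sketch below looks at a made-up list of samples. The `samples` array and both function names are assumptions for the example, not anything from electronics tooling or Kubernetes.

```javascript
// A hypothetical signal sampled at regular intervals: 0 is low, 1 is high.
const samples = [0, 0, 1, 1, 1, 0, 0];

// Edge triggered: report only the moments where the signal changes.
function findEdges(signal) {
  const edges = [];
  for (let i = 1; i < signal.length; i++) {
    if (signal[i] > signal[i - 1]) edges.push({ at: i, edge: "rising" });
    if (signal[i] < signal[i - 1]) edges.push({ at: i, edge: "falling" });
  }
  return edges;
}

// Level triggered: report every moment at which the signal is high.
function findHighLevels(signal) {
  return signal.flatMap((value, i) => (value === 1 ? [i] : []));
}

console.log(findEdges(samples));      // one rising edge at index 2, one falling edge at index 5
console.log(findHighLevels(samples)); // indexes 2, 3 and 4
```

The edge view produces two short-lived events, while the level view produces an answer for every sample at which the signal is high.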
To explain it another way, given the following simple addition:
> let a = 3;
> a += 4;
< 7
In an edge triggered view of the operation, we would see:
add 4 to a
This would happen once, at the time of the addition. In a level triggered view of the operation, we would see:
a is 7
We’d see this continuously from the time of the addition, until the next event occurs.
How edge and level triggered systems interpret a signal.
Under ideal conditions, both edge triggered and level triggered systems observe a correct view of the signal. Immediately after the signal transitions from on to off, they both see the signal as being in an off state.
Disrupting the rise and fall loses the high signal for the edge triggered system, but arrives at the correct end state.
With two disruptions placed around the first two changes to signal state, the differences between edge and level triggered systems are clear. The edge triggered view of the signal misses the first rise. The level triggered system assumes the signal is in its last observed state until it sees otherwise. This leads to an observed signal that is mostly correct, but delayed until after the disruption.
A single well-placed disruption can have a large impact on the edge triggered system.
Fewer disruptions don’t always lead to a better outcome. With a single disruption obscuring the fall from high back to low, the level triggered system is mostly correct again. The edge triggered system only sees two rises, leading to a state that the original signal was never in. To express this with addition again, the signal expressed:
> let a = 1;
> a += 1;
> a -= 1;
> a += 1;
< 2
But the edge triggered system observed:
> let a = 1;
> a += 1;
> a += 1;
< 3
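The same scenario can be replayed in a few lines. This is only a sketch: the `events` list, the `missed` set, and both observer functions are hypothetical names made up for the illustration. Each event carries both the change (`delta`) and the resulting value (`levelAfter`), so the two kinds of observer can read the same stream.

```javascript
// The signal from the text: start at 1, then +1, -1, +1.
const events = [
  { delta: +1, levelAfter: 2 },
  { delta: -1, levelAfter: 1 },
  { delta: +1, levelAfter: 2 },
];

// Edge triggered: apply every change we happen to receive.
function observeEdges(initial, stream, missed) {
  let value = initial;
  stream.forEach((event, i) => {
    if (!missed.has(i)) value += event.delta;
  });
  return value;
}

// Level triggered: remember the most recent level we managed to sample.
function observeLevels(initial, stream, missed) {
  let value = initial;
  stream.forEach((event, i) => {
    if (!missed.has(i)) value = event.levelAfter;
  });
  return value;
}

// A single disruption hides the fall (the second event, index 1).
const missed = new Set([1]);

console.log(observeEdges(1, events, missed));  // 3
console.log(observeLevels(1, events, missed)); // 2
```

With no disruptions (an empty `missed` set) both observers end at 2. With the fall hidden, the edge triggered observer drifts to 3, while the level triggered observer recovers as soon as it samples the signal again.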
Kubernetes is not just observing one signal, but two: the desired state of the cluster, and the actual state. The desired state is the state that humans using the cluster wish for it to be in (“Run two instances of my application container”). The actual state ideally matches the desired state, but it is subject to any number of hardware failures and malicious rodents. These can move it away from the desired state. Even time is a factor, as it isn’t possible to instantly have the actual state match the desired state. Container images have to download from the registry, applications need time for graceful shutdown, and so on.
Kubernetes has to take the actual state, and reconcile it with the desired state. It does so continuously, taking both states, determining the differences between them, and applying whatever changes have to be made to bring the actual state towards the desired state.
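A heavily simplified sketch of such a reconcile loop might look like the following. It is not how Kubernetes controllers are implemented: the state is reduced to a plain replica count, and `reconcileOnce`, `applyPlan`, `converge`, and the `maxPerPass` limit are all assumptions made for the example. The limit models the fact that containers take time to start and stop.

```javascript
// Compare the complete desired and actual states and plan the difference.
function reconcileOnce(desired, actual) {
  const difference = desired - actual;
  if (difference > 0) return { action: "start", count: difference };
  if (difference < 0) return { action: "stop", count: -difference };
  return { action: "none", count: 0 };
}

// Apply at most `maxPerPass` changes, since the actual state cannot react instantly.
function applyPlan(actual, plan, maxPerPass = 1) {
  const step = Math.min(plan.count, maxPerPass);
  if (plan.action === "start") return actual + step;
  if (plan.action === "stop") return actual - step;
  return actual;
}

// Repeat the comparison until the states match (or we give up).
function converge(desired, actual, maxPasses = 10) {
  const history = [actual];
  for (let pass = 0; pass < maxPasses; pass++) {
    const plan = reconcileOnce(desired, actual);
    if (plan.action === "none") break;
    actual = applyPlan(actual, plan);
    history.push(actual);
  }
  return history;
}

console.log(converge(2, 0)); // [ 0, 1, 2 ]
console.log(converge(2, 5)); // [ 5, 4, 3, 2 ]
```

Each pass starts from a fresh comparison of the two states, so a pass that fails or is skipped only delays convergence rather than corrupting the result.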
In an edge triggered system, we could diverge wildly from our desired outcome. Even without disruptions to the network, an edge triggered system trying to reconcile two states could end up with an incorrect outcome. If we start with a single container replica, and wish to scale to 5 replicas, then down to 2 replicas, an edge triggered system would see the following for the desired state:
> let replicas = 1;
> replicas += 4;
> replicas -= 3;
The actual state of the system cannot react instantly to these commands. As in the diagram, it can end up terminating 3 replicas when there are only 3 running, because the earlier scale-up had not finished. This leaves us with 0 replicas instead of the desired 2. In a level triggered system, we always compare the complete desired and actual states. This reduces the chances of state desynchronization (a bug).
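The scaling example can be replayed as a small simulation. This is a sketch under stated assumptions: `completed` is a hypothetical knob for how many changes take effect before the next command or observation arrives, and changes that have not completed by then are assumed to be abandoned. Neither function reflects real Kubernetes code.

```javascript
// Edge triggered: each command is a delta applied to whatever is running right now.
function edgeTriggered(initial, commands) {
  let actual = initial;
  for (const { delta, completed } of commands) {
    const applied = Math.min(Math.abs(delta), completed);
    actual = Math.max(actual + Math.sign(delta) * applied, 0);
  }
  return actual;
}

// Level triggered: each observation carries the complete desired state,
// and the change is computed from the current difference.
function levelTriggered(initial, observations) {
  let actual = initial;
  for (const { desired, completed } of observations) {
    const difference = desired - actual;
    const applied = Math.min(Math.abs(difference), completed);
    actual += Math.sign(difference) * applied;
  }
  return actual;
}

// Scale from 1 to 5, but only 2 of the 4 new replicas start in time.
// Then scale down to 2, with enough time to stop up to 3 replicas.
const asDeltas = [
  { delta: +4, completed: 2 },
  { delta: -3, completed: 3 },
];
const asLevels = [
  { desired: 5, completed: 2 },
  { desired: 2, completed: 3 },
];

console.log(edgeTriggered(1, asDeltas));  // 0
console.log(levelTriggered(1, asLevels)); // 2
```

The edge triggered version removes 3 replicas because that is what the command said, even though only 3 were running. The level triggered version sees 3 running and 2 desired, so it removes only 1.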