Hello fellow administrators, today we will be discussing STONITH.
STONITH stands for "Shoot The Other Node In The Head". It is a service that helps maintain the integrity of the nodes in an HA cluster.
What does STONITH actually do?
Say you have an HA cluster with a primary server and an HA (standby) server in it. If one of the servers stops working correctly, or a failover is triggered, the attached HA server automatically comes up as the primary, while the faulty system is stopped and not allowed to start.
Fencing goes hand in hand with STONITH. Fencing is the method used to bring a cluster to a known state.
So what is done in fencing?
A cluster sometimes detects that one of the nodes is behaving strangely and needs to remove it. Every resource in a cluster has a state attached. The cluster must make sure that every resource is started on only one node at a time (i.e. only the HA server or only the primary is active).
Every node must report every change that happens to a resource. The cluster state is thus a collection of resource states and node states.
When the state of a node or resource cannot be established with certainty, fencing comes in. Even when the cluster is not aware of what is happening on a given node, fencing can ensure that the node does not run any important resources.
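The rule above can be sketched in a few lines of Python. This is only an illustrative model, not how Pacemaker is implemented: the `NodeState` values, the `Node` class and the `nodes_to_fence` helper are all hypothetical names made up for this example.

```python
from dataclasses import dataclass
from enum import Enum


class NodeState(Enum):
    ONLINE = "online"    # node reports its state normally
    OFFLINE = "offline"  # node shut down cleanly and the cluster knows it
    UNKNOWN = "unknown"  # node stopped reporting; its state is uncertain


@dataclass
class Node:
    name: str
    state: NodeState


def nodes_to_fence(nodes):
    """Return the names of nodes whose state cannot be established."""
    return [n.name for n in nodes if n.state is NodeState.UNKNOWN]


cluster = [
    Node("primary", NodeState.UNKNOWN),  # stopped answering
    Node("ha", NodeState.ONLINE),
]
print(nodes_to_fence(cluster))  # ['primary']
```

Only the node with an uncertain state is selected. A node that is known to be offline needs no fencing, because the cluster already knows it is not running any resources.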
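Fencing is of two types: resource-level fencing, which keeps a node away from one or more resources, and node-level fencing, which keeps a node from running any resources at all. STONITH is node-level fencing.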
To list the available STONITH devices, run either of these commands:
    stonith -L
    crm ra list stonith
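If you want to collect that list from a script, a small Python wrapper could look like the sketch below. It assumes that at least one of the two commands is installed and that the output is a whitespace-separated list of device names; the function names `pick_list_command` and `list_stonith_devices` are made up for this example.

```python
import shutil
import subprocess

# The two listing commands, in order of preference.
LIST_COMMANDS = [
    ["stonith", "-L"],
    ["crm", "ra", "list", "stonith"],
]


def pick_list_command(which=shutil.which):
    """Return the first listing command whose binary is on PATH, else None."""
    for cmd in LIST_COMMANDS:
        if which(cmd[0]):
            return cmd
    return None


def list_stonith_devices():
    """Run the listing command and return the device names it prints.

    Assumption: names are separated by whitespace (spaces or newlines).
    """
    cmd = pick_list_command()
    if cmd is None:
        return []  # neither tool is installed on this machine
    result = subprocess.run(cmd, capture_output=True, text=True, check=False)
    return result.stdout.split()
```

Calling `list_stonith_devices()` on a cluster node should return names such as the `external/sbd` and `external/ipmi` devices discussed later in this post, provided those plugins are installed.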
Let's understand why we said that it maintains integrity:
When the nodes of a cluster lose contact with each other, the cluster can end up in a “split brain scenario”, and this may result in bad things happening to the cluster resources. Imagine, for example, a database that starts running twice in the cluster, or a file system that is written to by two independent nodes at the same time. So, having a split brain in the cluster is bad, and the way to ensure that no such scenario can occur in the cluster is to use the STONITH approach.
What actually happens?
Cluster resources are not in sync and each node in the cluster believes it is the only active node.
To avoid this, we can configure SBD (STONITH Block Device, sometimes expanded as Split Brain Detector) as the node fencing mechanism to shut down a node in case of a split-brain scenario. SBD provides a node fencing mechanism for Pacemaker-based clusters through the exchange of messages via shared block storage.
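The message exchange can be pictured with a toy Python model. This is a heavily simplified sketch, not the real on-disk format: the `SharedDisk` and `Node` classes are invented for illustration, and the only assumption borrowed from SBD is the idea of one message slot per node that can hold a message such as "reset".

```python
class SharedDisk:
    """Stands in for the shared block device: one message slot per node."""

    def __init__(self, node_names):
        self.slots = {name: "clear" for name in node_names}

    def write(self, target, message):
        self.slots[target] = message

    def read(self, node):
        return self.slots[node]


class Node:
    def __init__(self, name, disk):
        self.name = name
        self.disk = disk
        self.running = True

    def fence(self, target):
        # Leave a "poison pill" in the other node's slot.
        self.disk.write(target, "reset")

    def poll(self):
        # Each node checks its own slot regularly and obeys what it finds.
        if self.disk.read(self.name) == "reset":
            self.running = False
        return self.running


disk = SharedDisk(["node1", "node2"])
node1 = Node("node1", disk)
node2 = Node("node2", disk)

node1.fence("node2")  # the network is split, but both nodes still see the disk
node2.poll()
print(node1.running, node2.running)  # True False
```

The point of the design is that fencing does not depend on the network link that has just failed: as long as both nodes can reach the shared disk, the message gets through.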
What does STONITH do in a split-brain scenario?
STONITH (Shoot The Other Node In The Head) is basically a fencing mechanism that powers down the selected server remotely, removing it from the cluster and allowing the other nodes in the cluster to take over.
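To see the difference STONITH makes, here is a deliberately tiny Python sketch of a two-node split. The function `run_after_split` is hypothetical, and it assumes that the first node in the list wins the fencing race; in a real cluster the winner depends on timing and configuration.

```python
def run_after_split(nodes, stonith_enabled):
    """Return the nodes that run the resource after the nodes lose contact."""
    alive = list(nodes)
    if stonith_enabled and len(alive) > 1:
        survivor = alive[0]  # assumption: the first node wins the fencing race
        alive = [survivor]   # every other node is powered off remotely
    # Each node that is still up believes it is alone and starts the resource.
    return alive


print(run_after_split(["primary", "ha"], stonith_enabled=False))  # ['primary', 'ha']
print(run_after_split(["primary", "ha"], stonith_enabled=True))   # ['primary']
```

Without STONITH the resource ends up running on both nodes, which is exactly the duplicated database from the split-brain example. With STONITH only one node is left to run it.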
Different STONITH approaches
- Disk-based STONITH: external/sbd (On Premise – Best Practice)
- Hardware-based STONITH: external/ipmi (On Premise – Second Choice)
- GCP STONITH: external/gcpstonith (Google Cloud)