Kdump-in-cluster-environment HOWTO

Introduction

Kdump is a kexec-based crash dumping mechanism for Linux. This document
illustrates how to configure kdump in a cluster environment so that the kdump
crash recovery service can complete without being preempted by traditional
power fencing methods.

Overview

Kexec/Kdump

Details about Kexec/Kdump are available in the Kexec-Kdump-howto file and will
not be described here.

fence_kdump

fence_kdump is an I/O fencing agent to be used with the kdump crash recovery
service. When the fence_kdump agent is invoked, it listens for a message
from the failed node acknowledging that the failed node is executing the
kdump crash kernel. Note that fence_kdump is not a replacement for traditional
fencing methods. The fence_kdump agent can only detect that a node has entered
the kdump crash recovery service. This allows the kdump crash recovery service
to complete without being preempted by traditional power fencing methods.

fence_kdump_send

fence_kdump_send is a utility used to send messages that acknowledge that the
node itself has entered the kdump crash recovery service. The fence_kdump_send
utility is typically run in the kdump kernel after a cluster node has
encountered a kernel panic. Once the cluster node has entered the kdump crash
recovery service, fence_kdump_send periodically sends messages to all
cluster nodes. When the fence_kdump agent receives a valid message from the
failed node, fencing is complete.
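
As an illustration of the sending side described above, the following is a
minimal dry-run sketch that only prints a fence_kdump_send command line instead
of executing it. The helper name build_fence_kdump_send is hypothetical, and
the -i (interval in seconds) and -c (message count, 0 meaning no limit)
options are assumed from the fence_kdump_send manual page; check them against
the version installed on your system.

```shell
# Dry-run sketch: print the fence_kdump_send command line instead of running it.
# build_fence_kdump_send is a hypothetical helper, not part of kexec-tools.
# Usage: build_fence_kdump_send INTERVAL COUNT NODE...
build_fence_kdump_send() {
    interval=$1
    count=$2
    shift 2
    if [ "$#" -eq 0 ]; then
        echo "error: at least one target node is required" >&2
        return 1
    fi
    # -i: seconds between messages, -c: number of messages (assumed: 0 = no limit)
    echo "fence_kdump_send -i $interval -c $count $*"
}

# Example: message node1..node3 every 2 seconds with no message limit.
build_fence_kdump_send 2 0 node1 node2 node3
```

In practice the kdump initramfs runs the utility for you; the sketch is only
meant to show which pieces of information (target nodes, interval, count) go
into the notification.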

How to configure Pacemaker cluster environment:

To use kdump in a Pacemaker cluster environment, fence-agents-kdump must be
installed on every node in the cluster. You can do this with the following
command:

  # yum install -y fence-agents-kdump

The next step is to add fence_kdump to the cluster. Assume that the cluster
consists of three nodes (node1, node2 and node3) and uses Pacemaker for
resource management, with pcs as the CLI configuration tool.

With pcs it is easy to add a stonith resource to the cluster. For example, add
a stonith resource named mykdumpfence with a fence type of fence_kdump via the
following commands:

   # pcs stonith create mykdumpfence fence_kdump \
     pcmk_host_check=static-list pcmk_host_list="node1 node2 node3"
   # pcs stonith update mykdumpfence pcmk_monitor_action=metadata --force
   # pcs stonith update mykdumpfence pcmk_status_action=metadata --force
   # pcs stonith update mykdumpfence pcmk_reboot_action=off --force

Then enable stonith:

   # pcs property set stonith-enabled=true
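
For clusters with a different resource name or node list, the commands above
can be parameterized. The following is a dry-run sketch that only prints the
same pcs commands; it does not run pcs. The helper name
print_kdump_fence_setup is hypothetical, and the options it emits are exactly
the ones shown in this section.

```shell
# Dry-run sketch: print (not execute) the pcs commands from this section.
# print_kdump_fence_setup is a hypothetical helper name.
# Usage: print_kdump_fence_setup RESOURCE NODE...
print_kdump_fence_setup() {
    res=$1
    shift
    if [ "$#" -eq 0 ]; then
        echo "error: no nodes given" >&2
        return 1
    fi
    # "$*" joins the remaining arguments (the node names) with spaces.
    echo "pcs stonith create $res fence_kdump pcmk_host_check=static-list pcmk_host_list=\"$*\""
    for opt in pcmk_monitor_action=metadata \
               pcmk_status_action=metadata \
               pcmk_reboot_action=off; do
        echo "pcs stonith update $res $opt --force"
    done
    echo "pcs property set stonith-enabled=true"
}

print_kdump_fence_setup mykdumpfence node1 node2 node3
```

Review the printed commands before running them as root on a cluster node.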

How to configure kdump:

There are two ways to configure fence_kdump support:

1) Pacemaker based clusters
     If you have successfully configured fence_kdump in Pacemaker, there is
     no need to add any special configuration in kdump. Please refer to the
     Kexec-Kdump-howto file for more information.

2) Generic clusters
     For other types of clusters there are two configuration options in
     kdump.conf which enable fence_kdump support:

       fence_kdump_nodes <node(s)>
            Space-separated list of cluster node(s) to send fence_kdump
            notifications to (this option is mandatory to enable
            fence_kdump)

       fence_kdump_args <arg(s)>
            Command line arguments for fence_kdump_send (it can contain
            all valid arguments except the hosts to send notifications to)

     These options will most probably be configured by your cluster software,
     so please refer to your cluster documentation for how to enable
     fence_kdump support.
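
To make the two options above concrete, here is a minimal sketch that prints
what the corresponding kdump.conf lines could look like. The helper name
render_fence_kdump_conf is hypothetical, the node names are the example nodes
used earlier in this document, and the argument string is only an assumed
example of fence_kdump_send options; use whatever your cluster software
requires.

```shell
# Sketch: print the kdump.conf lines for a generic cluster setup.
# render_fence_kdump_conf is a hypothetical helper name.
# Usage: render_fence_kdump_conf "ARGS" NODE...
#   ARGS  extra fence_kdump_send arguments as one string (may be empty)
#   NODE  one or more cluster nodes to notify
render_fence_kdump_conf() {
    args=$1
    shift
    if [ "$#" -eq 0 ]; then
        echo "error: fence_kdump_nodes needs at least one node" >&2
        return 1
    fi
    printf 'fence_kdump_nodes %s\n' "$*"
    # fence_kdump_args is optional; emit it only when arguments were given.
    if [ -n "$args" ]; then
        printf 'fence_kdump_args %s\n' "$args"
    fi
}

# Assumed example values; adjust to your environment.
render_fence_kdump_conf "-p 7410 -f auto -c 0 -i 10" node1 node2 node3
```

Note that the host list appears only in fence_kdump_nodes, never in
fence_kdump_args, which matches the restriction described above.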

Please be aware that these two ways cannot be combined and that 2) takes
precedence over 1). This means that if fence_kdump is configured using the
fence_kdump_nodes and fence_kdump_args options in kdump.conf, the Pacemaker
configuration is not used even if it exists.
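
The precedence rule can be checked with a small script. The following is a
sketch only: the function name fence_kdump_mode and the labels it prints are
hypothetical, and it simply tests whether a kdump.conf file contains an
uncommented, non-empty fence_kdump_nodes line. It does not inspect the
Pacemaker configuration itself.

```shell
# Sketch: report which fence_kdump configuration path applies for a kdump.conf.
# Prints "generic" when an uncommented, non-empty fence_kdump_nodes line exists,
# otherwise "pacemaker-or-none". Name and labels are hypothetical.
fence_kdump_mode() {
    conf=${1:-/etc/kdump.conf}
    if grep -Eq '^[[:space:]]*fence_kdump_nodes[[:space:]]+[^[:space:]]' "$conf" 2>/dev/null; then
        echo generic
    else
        echo pacemaker-or-none
    fi
}

# Demonstration on a scratch file rather than the real /etc/kdump.conf.
tmp=$(mktemp)
printf '%s\n' 'path /var/crash' 'fence_kdump_nodes node1 node2' > "$tmp"
fence_kdump_mode "$tmp"
rm -f "$tmp"
```

If the function reports "generic", the kdump.conf options are in effect and
any fence_kdump setup in Pacemaker is ignored by kdump, as stated above.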