I have a couple of questions about how the heartbeat mechanism works with HA (vSphere 5.0).
After going through the documentation, I see there are two types of heartbeat mechanism:
(i) network heartbeat (traditional method)
(ii) datastore heartbeat (the master agent uses it to validate the status of slaves when it cannot communicate with them over the management network)
My question really is: how do both actually work?
Network heartbeat
I understand that the master sends heartbeats to all of its slaves, and every slave sends heartbeats to its master. Slaves do not send heartbeats to each other.
What information is actually exchanged in a heartbeat? Does the slave send some kind of text message to the master indicating that it is alive? Can that be viewed in the logs?
If a heartbeat is sent every second, will there be a heartbeat entry in the log file for every second? What does "heartbeat" actually mean here?
Datastore heartbeating uses a "heartbeat region file". I understand that on an NFS datastore, the datastore heartbeat files (host-<number>-hb) are touched every 5 seconds by the corresponding host, and the HA master validates them using the file's timestamp.
We actually use VMFS volumes. Each of the two datastores chosen for heartbeating has a "host-<number>-hb" file created for each host. These files are never updated, right? Please confirm. As far as I can see, the timestamps of these files have not changed since HA was configured.
So the HA master validates the status of a host by checking whether its file is held open by the corresponding host, right?
For example, with 2 nodes in the cluster, the files are:
node 1 - host-1419-hb
node 2 - host-1494-hb
So HA validates the status of node 1 by checking that host-1419-hb is held open by node 1, and likewise for node 2, but nothing is written to these files by the respective hosts, right?
Is there a way to find out on the command line which file is opened by which host? How does HA validate which host has a file open?
For network heartbeating, the heartbeat is in binary format, not text, and cannot be viewed in the logs. The heartbeat contains some counters and internal state that the master uses to keep in sync with the slaves and vice versa.
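To make "binary format, not text" concrete, here is a minimal sketch in Python of what a fixed-layout binary heartbeat could look like. The layout (host id, sequence counter, state code) and the function names are assumptions made up for illustration; this is not the actual HA wire format, which is internal.

```python
import struct

# HYPOTHETICAL layout, not the real HA heartbeat format:
# host id (uint32), sequence counter (uint64), state code (uint8),
# packed in network byte order with no padding.
HEARTBEAT_FORMAT = "!IQB"


def pack_heartbeat(host_id, sequence, state):
    """Encode one heartbeat as a fixed-size binary payload."""
    return struct.pack(HEARTBEAT_FORMAT, host_id, sequence, state)


def unpack_heartbeat(payload):
    """Decode a payload back into (host_id, sequence, state)."""
    return struct.unpack(HEARTBEAT_FORMAT, payload)


payload = pack_heartbeat(1419, 42, 1)
decoded = unpack_heartbeat(payload)
```

A payload like this is just a few raw bytes, which is why there is nothing readable to print in a log for each heartbeat.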
For datastore heartbeating on VMFS datastores, the slaves just open their corresponding file with a special kind of exclusive lock (the files themselves are not updated). This causes the ESX kernel to periodically update a counter in the heartbeat region of the VMFS datastore. Each host with access to the datastore has its own entry in this heartbeat region. The HA master can check whether the entry in the heartbeat region for a specific host is being updated, which is an indication of liveness. The master does not need to check which host is locking each particular heartbeat file (there isn't a way to check this on the command line).
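The liveness check described above boils down to "is this host's counter still changing?". A rough sketch of that logic in Python follows. The class name, the max_stale_polls threshold, and the idea of feeding it counter values directly are all assumptions for illustration; it does not read a real VMFS heartbeat region and is not how the HA agent is actually implemented.

```python
class HeartbeatRegionMonitor:
    """Tracks per-host counters and flags hosts whose counter stops changing.

    HYPOTHETICAL model of the idea only; the counter values would come
    from each host's entry in the datastore heartbeat region.
    """

    def __init__(self, max_stale_polls=3):
        # Number of consecutive unchanged reads before a host is
        # considered not alive (assumed value, chosen for illustration).
        self.max_stale_polls = max_stale_polls
        self._last_counter = {}
        self._stale_polls = {}

    def observe(self, host, counter):
        """Record one counter read; return True if the host looks alive."""
        if self._last_counter.get(host) != counter:
            # Counter moved since the last read: the host is updating it.
            self._last_counter[host] = counter
            self._stale_polls[host] = 0
        else:
            # Counter unchanged: one more stale poll for this host.
            self._stale_polls[host] += 1
        return self._stale_polls[host] < self.max_stale_polls
```

For example, with max_stale_polls=2, a host whose counter reads 10, 11, 11, 11 would be reported alive for the first three reads and not alive on the fourth. Note that the master only looks at the counter; it never needs to inspect the host-<number>-hb file contents or timestamps.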
Frantisek Ferencik wrote:
Does this mean that HA 5.0 is using the same "heartbeat region" in VMFS as HA in previous versions?
HA never really used the heartbeat region in the previous version for these reasons.