Hi All,
Can anyone tell me what exactly the SPLIT BRAIN concept in VMware is? When
does this occur and how is it handled by the servers?
I have done my ESX 4.0 training, but the course content was too large and the time too short, so not all the topics were covered clearly.
Thanks.
J
From page 13 of http://www.vmware.com/pdf/vmware_ha_wp.pdf
You can change this default behavior for individual virtual machines and choose Leave
running to indicate the virtual machine on isolated hosts should continue running even if
the host can no longer communicate with other hosts in the cluster. If you choose to do
this and it turns out that the original host can't access shared storage, the virtual machine
lock will time out and the virtual machine may be started on a second host (a condition
commonly referred to as split-brain). This condition is more likely to occur with NAS or iSCSI
storage, in the case of network failures, since both methods are TCP/IP based. For these
types of storage, keeping the Isolation Response at Power off (the default) is highly
recommended.
Andre
Thanks for the reply,
I will go through the document and get back to you if I have any doubts.
Much appreciated
It refers to a VMware HA cluster in which one node becomes isolated from the other nodes: the remaining nodes believe there has been a failure, but the VMs are still running on the original node. Another document to check out is http://www.vmware.com/pdf/vsphere4/r40/vsp_40_availability.pdf, particularly the sections on isolation response.
If you find this or any other answer useful please consider awarding points by marking the answer correct or helpful
Hi All,
Well, I still have a bit of confusion about the term "SPLIT BRAIN".
I went through the doc provided, but it does not explain much more about the concept.
What does Split Brain do and when does it occur?
Please advise.
Thanks,
J
As I indicated, split-brain is the term used when an ESX server becomes isolated from the other hosts in an HA cluster. This occurs when the isolated host loses network connectivity and cannot reach the rest of the cluster.
Once the ESX server determines it is isolated, it follows the configured Isolation Response for each VM: either shut it down or leave it running.
The cluster treats this as a failure and will attempt to restart the VMs on one of the remaining nodes. While a VM is still running on the isolated host and that host still holds the VM's lock on shared storage, the restart cannot succeed, so the cluster keeps retrying until the VM is shut down.
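To make the sequence above concrete, here is a rough Python sketch of the logic. It is only a toy model and not VMware code: the names (VirtualMachine, host_isolated, try_restart, is_split_brain) are made up for illustration, and the lock handling is simplified to "whoever holds the lock is the only host that can start the VM". It assumes, as the white paper quoted earlier describes, that a host which cannot reach shared storage eventually loses its lock.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional, Set


class IsolationResponse(Enum):
    POWER_OFF = "power off"
    LEAVE_RUNNING = "leave running"


@dataclass
class VirtualMachine:
    name: str
    response: IsolationResponse
    # Host currently holding the VM's lock on shared storage (hypothetical model).
    lock_holder: Optional[str] = None
    # Hosts on which a copy of the VM is currently running.
    running_on: Set[str] = field(default_factory=set)


def host_isolated(vm: VirtualMachine, host: str, storage_reachable: bool) -> None:
    """Apply the VM's isolation response on the host that found itself isolated."""
    if vm.response is IsolationResponse.POWER_OFF:
        # The VM stops, so its lock is freed (released or timed out).
        vm.running_on.discard(host)
        if vm.lock_holder == host:
            vm.lock_holder = None
    elif not storage_reachable and vm.lock_holder == host:
        # The VM keeps running, but the host cannot refresh its lock,
        # so in this model the lock times out.
        vm.lock_holder = None


def try_restart(vm: VirtualMachine, new_host: str) -> bool:
    """The cluster's restart attempt; it succeeds only if the lock is free."""
    if vm.lock_holder is not None:
        return False
    vm.lock_holder = new_host
    vm.running_on.add(new_host)
    return True


def is_split_brain(vm: VirtualMachine) -> bool:
    """True when the same VM is running on more than one host."""
    return len(vm.running_on) > 1


if __name__ == "__main__":
    vm = VirtualMachine("vm1", IsolationResponse.LEAVE_RUNNING,
                        lock_holder="esx1", running_on={"esx1"})
    host_isolated(vm, "esx1", storage_reachable=False)
    print(try_restart(vm, "esx2"), is_split_brain(vm))
```

With "leave running" and storage still reachable, try_restart keeps returning False, which is the retry loop described above. With "leave running" and storage unreachable, the restart succeeds while the original copy is still running, which is the duplicate-VM case the white paper warns about.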