Service Console best settings What happens if default gateway fails in HA?

I'm just curious about HA and how it works.

At the moment, I have 3 ESX 3.5 hosts and VC 2.5. Initially, the people who set them up put both the Service Console and the VMs on the same vSwitch and on the same internal VLAN. I have now created more VLANs, which are routed on our firewall (Checkpoint). HA was enabled and working, but I disabled it while I was setting up the new networks and assigning new IP addresses to the SC.

My question now is: what happens if our firewall (router) fails? I know that HA pings the default gateway, but doesn't it detect that the other ESX hosts are still running, and so not force a restart of the VMs? I have set up switch redundancy, and all virtual switches have dual or triple physical NICs.


The gateway is only pinged to check whether the host is isolated (= has no network connectivity), and this only starts once the host stops receiving heartbeats from the other hosts in the cluster. As your service consoles are probably on the same subnet, they would keep talking to each other even with a dead GW.
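To make the order of those two checks concrete, here is a minimal sketch in Python. It is only a simplified model of the behaviour described above, not VMware's actual implementation; the function and parameter names are made up for illustration.

```python
def host_is_isolated(heartbeats_received: bool,
                     isolation_address_reachable: bool) -> bool:
    """Simplified model of the HA isolation check (hypothetical names).

    The isolation address (by default the Service Console gateway) only
    matters once heartbeats from the other cluster hosts have stopped.
    """
    if heartbeats_received:
        # Peers are still talking to us, so the gateway is never consulted.
        return False
    # No heartbeats: the host is isolated only if the ping also fails.
    return not isolation_address_reachable


if __name__ == "__main__":
    # Firewall/gateway down, but the hosts share a subnet and still heartbeat.
    print(host_is_isolated(heartbeats_received=True,
                           isolation_address_reachable=False))
    # Host has lost its Service Console network entirely.
    print(host_is_isolated(heartbeats_received=False,
                           isolation_address_reachable=False))
```

In this model, the dead-firewall case from the question never reaches the gateway ping, because the hosts still see each other's heartbeats.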


Thanks for the help.

Yes, all my hosts are on the same subnet, so then I know I can survive a downed firewall.


HA has a 15-second heartbeat timeout, and if a host is found to be isolated, its VMs are restarted on other ESX hosts within your cluster. HA depends heavily on your DNS infrastructure, so I would place entries for all ESX hosts in the /etc/hosts file to make sure that, if your DNS fails, the hosts can still communicate via the host entries.

Regards,

Stefan Nguyen
iGeek Systems Inc.
VMware, Citrix, Microsoft Consultant


Best Practice for HA requires redundancy for the Service Console network, this can be accomplished one of two ways:

1. Single Service Console Network with redundant pNICs connected to different pSwitches

2. Secondary Service Console Network. You can create a second SC  portgroup on a new or existing vSwitch, and then configure a second  Isolation Address (under HA Advanced Options set: das.isolationaddress2 =  SecondIPAddress )

Personally I like Option 1 better. Another advanced option you might  want to consider setting is changing the default timeout value:  das.failuredetectiontime = timeinms


I changed this from the default of 15000 (15 seconds) to 60000 (60 seconds). This just gives you a little more time before HA thinks you have an isolated/down ESX server.
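Since das.failuredetectiontime is entered in milliseconds, it is easy to get the value wrong. As a rough sketch, the small Python helper below formats the two advanced options mentioned in this thread so they can be typed into the HA Advanced Options dialog. The helper itself and the address 192.0.2.1 are hypothetical; only the two option names come from the posts above, and the script does not talk to VirtualCenter.

```python
import ipaddress
from typing import Dict, Optional


def ha_advanced_options(failure_detection_seconds: int,
                        second_isolation_address: Optional[str] = None
                        ) -> Dict[str, str]:
    """Build HA advanced option name/value pairs (illustrative helper)."""
    if failure_detection_seconds <= 0:
        raise ValueError("failure detection time must be positive")
    # The option expects milliseconds, e.g. 60 seconds -> 60000.
    options = {
        "das.failuredetectiontime": str(failure_detection_seconds * 1000),
    }
    if second_isolation_address is not None:
        # Raises ValueError if the string is not a valid IP address.
        ipaddress.ip_address(second_isolation_address)
        options["das.isolationaddress2"] = second_isolation_address
    return options


if __name__ == "__main__":
    # 192.0.2.1 is a placeholder address, not a recommendation.
    for name, value in ha_advanced_options(60, "192.0.2.1").items():
        print(f"{name} = {value}")
```

The second isolation address is optional here because it only applies if you go with a secondary Service Console network.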


The other option is to change the default Isolation Response from Power Off to Leave Powered On. This makes sure VMs do not get powered off by false HA isolation events. It does mean that if the server really is "isolated" the VMs won't be moved, but they should still be up and running, because we are not talking about an ESX server being down, just isolated from the rest of the cluster. IMO it's better to leave the VMs running if they are still up and resolve the problem after business hours.

Don Pomeroy
VMware Communities User Moderator


" I would place entries in /etc/hosts file for all ESX hosts to  make sure if your DNS failed it still communicate via host entries."
I have a question about this.  From what I read in the HA Best Practices doc (http://kb.vmware.com/Platform/Publishing/attachments/1002080_fHA_Tech_Best_Practices.pdf), the ESX server does this automatically and VMware recommends against manually editing the hosts file.
Here is the text from the doc:
1. Proper DNS & Network settings are needed for initial configuration
After configuration DNS resolutions are cached to /etc/FT_Hosts (minimizing the dependency on DNS server availability during an actual failover)
DNS on each host is preferred (manual editing of /etc/hosts is error prone)
So what is everyone else doing?

This document was generated from the following thread: What happens if default gateway fails in HA?

Version history: Revision 1 of 1. Last update: 06-17-2008 05:15 PM