Solved: Re: Unintended HA failover events

vmproteau · ‎11-18-2008

In our environment, network changes or Host misconfiguration appear more likely to trigger an HA event than an actual Host failure.

The things I've done to reduce these events:

Always having two pNICs on the Service Console vSwitch.
Each NIC goes to a separate physical switch.
Spanning Tree Protocol (STP): disable STP on the physical switch ports connected to the ESX Server host. For Cisco-based networks, enable PortFast mode for access interfaces or PortFast trunk mode for trunk interfaces (saves about 30 seconds during initialization of the physical switch port).
EtherChannel negotiation, such as PAgP or LACP: must be disabled because it is not supported.
Trunking negotiation: disable it as well (saves about four seconds).
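
For anyone scripting the switch side, here is a rough sketch of how the port settings above might be generated for a Cisco IOS switch. The function name and the interface name are made up for illustration, and exact command syntax varies by platform and IOS version, so treat the output as a starting point to check against your switch documentation rather than a tested configuration.

```python
from typing import List


def esx_port_commands(interface: str, trunk: bool) -> List[str]:
    """Build illustrative Cisco IOS commands for a switch port facing an ESX host.

    Hypothetical helper: it only assembles text, it does not talk to a switch.
    """
    commands = [f"interface {interface}"]
    if trunk:
        commands.append("switchport mode trunk")
        # Turn off trunking negotiation (DTP) on the port.
        commands.append("switchport nonegotiate")
        # PortFast variant intended for trunk interfaces.
        commands.append("spanning-tree portfast trunk")
    else:
        commands.append("switchport mode access")
        commands.append("switchport nonegotiate")
        # PortFast for access interfaces.
        commands.append("spanning-tree portfast")
    # No channel-group line is emitted, so the port does not take part in
    # PAgP or LACP negotiation.
    return commands


if __name__ == "__main__":
    for line in esx_port_commands("GigabitEthernet0/1", trunk=True):
        print(line)
```
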

What other things could I do?

weinstein5 · ‎11-18-2008

Also add a second service console port on a different network segment.


vmproteau · ‎11-18-2008

I've considered that. So, with a second Service Console, a Host isn't considered isolated unless both IPs are inaccessible. Is that correct?

BUGCHK · ‎11-20-2008

You could increase the timeout with the HA advanced option das.failuredetectiontime (in milliseconds). In most environments 15 seconds is a bit too eager, I think.

vmproteau · ‎11-20-2008

Agreed, 15 seconds is a relative hair trigger, and I remembered that I had already set this to 60 seconds.
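
Since das.failuredetectiontime is given in milliseconds, a small sketch of the conversion may help avoid an off-by-a-thousand mistake. The helper name is hypothetical and the function only builds the key/value pair; it does not apply anything to a cluster.

```python
def failure_detection_option(seconds: int) -> dict:
    """Return the HA advanced option for a failure detection time given in seconds.

    Hypothetical helper for illustration: the option name comes from this
    thread, and the value is converted to milliseconds.
    """
    if seconds <= 0:
        raise ValueError("failure detection time must be positive")
    return {"das.failuredetectiontime": seconds * 1000}


if __name__ == "__main__":
    # The 15-second value discussed above, expressed in milliseconds.
    print(failure_detection_option(15))
    # The 60-second value I already had configured.
    print(failure_detection_option(60))
```

So 60 seconds is entered as 60000 in the cluster's HA advanced options.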
