Re: Reconfiguring HA on all servers because of core switch restart?

NNy2k · ‎10-28-2009

Hopefully I'm in the correct community...

Over the weekend we had some maintenance performed on our core switch and as a result needed to reload it a couple of times.

During the reloads we observed that the VM servers (we'll call them VM1 through VM5) all reported "reconfiguring HA" and stayed at 50% progress for some time. While this was happening, we noticed that VM2 took over all of the VMs and did a fair job despite being dramatically more challenged than it should have been. We were then able to migrate the VMs across the other 4 servers without issue, but upon reloading the switch again the same thing happened.

Our network isn't too complicated: one core switch, ESX servers all have 2 gig network ports active, no VLANs.

I'm just hoping for some thoughts on where I can start to look. We plan to open a ticket with support as well, but we like to do our own investigating too.

Thanks much!

Joe

wardb0071 · ‎10-28-2009

Hi Joe,

I had this exact same problem a few months back. I believe this is happening because you have the core switch IP address set as the isolation address in your cluster.

You should set the isolation address to an IP address that is redundant, such as one served by redundant core switches. Another option, if you don't have a redundant firewall or switch, is to raise the isolation response timeout for the cluster so that a brief switch outage doesn't trigger it.
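
A rough sketch of those two options, in case it helps. The option names das.isolationaddress and das.failuredetectiontime are HA advanced options as I remember them, so double-check them against the documentation for your ESX version. The address and timeout values are made-up placeholders, and the small function is only a simplified way to reason about the timing, not how HA actually makes its decision.

```python
# Illustrative only: placeholder values, not taken from any real cluster.
ha_advanced_options = {
    # Assumed placeholder: an address that stays reachable while the core switch reloads.
    "das.isolationaddress": "192.0.2.254",
    # Milliseconds; raised from the 15000 ms default to ride out a short outage.
    "das.failuredetectiontime": 60000,
}


def hosts_would_isolate(outage_seconds, options):
    """Simplified model: True if the isolation address is unreachable
    for longer than the configured failure detection window."""
    window_seconds = options["das.failuredetectiontime"] / 1000
    return outage_seconds > window_seconds


# A 30 second switch reload fits inside a 60 second window; a 90 second one does not.
print(hosts_would_isolate(30, ha_advanced_options))
print(hosts_would_isolate(90, ha_advanced_options))
```

The point is just that either the address being pinged has to survive the maintenance, or the window has to be longer than the reload takes.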

See attached file.

Hope this helps

Cheers,

Brian

NNy2k · ‎10-28-2009

Thanks Brian, this is just what I was looking for. I'm still relatively green on ESX and appreciate the help!
