Hi,
I upgraded my two hosts from ESX 3.5 Update 1 to ESX 3.5 Update 2 using Update Manager. I upgraded my VirtualCenter server first. All went well, except that I got an error on HA. One of the hosts reported "HA agent has an error", and the cluster showed "could not contact primary HA agent". That happens sometimes, so I selected "Reconfigure for HA" on the failing host. This did not work. OK, so I disabled HA altogether, then re-enabled HA at cluster level. The same problem occurred.
I finally found the answer: when I looked at the configuration of my hosts, under "DNS and Routing", I noticed a little "reboot" sign next to the hostname. Weird, since I had not changed the names of the hosts. Rebooting the ESX hosts and VirtualCenter did not solve the problem. Finally I noticed that my hostnames started with a capital letter, and the DNS names (obviously) did not. After changing the hostnames to all lowercase characters, the reboot sign disappeared immediately without any reboot (!!??). I performed a reboot anyway, enabled HA on the cluster, and all was well again.
Kinda answered my own question
However, this may help others with HA problems after the upgrade...
It will be interesting to see if this is a bug in the upgrade process, i.e. whether the upgrade capitalizes the host names on the upgraded hosts.
Hi... Done some more testing on other environments... It appears that you get a problem whenever the capitalization of the configured hostname does not match your /etc/hosts file, in either direction. Another environment had a capital letter in the /etc/hosts file but none in the configured hostname, and it also suffered from HA failures (although less severe ones: "Reconfigure for HA" worked in this case).
So basically: If you have HA issues after upgrading to ESX 3.5 update2, make sure you match capitals in your configured hostnames and /etc/hosts file!
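For anyone who wants to check this quickly, here is a minimal Python sketch (my own illustration, not a VMware tool) that compares a configured hostname against the names in a hosts file and reports entries that match only when case is ignored. The function name find_case_mismatches and the sample hostnames and addresses are made up; on a real host you would pass in the contents of /etc/hosts and the hostname as configured.

```python
def find_case_mismatches(configured_hostname, hosts_text):
    """Return hosts-file names that equal the configured hostname (or its
    short name) only when letter case is ignored."""
    # Compare both the FQDN and the short name (part before the first dot).
    wanted = {configured_hostname, configured_hostname.split(".")[0]}
    wanted_lower = {name.lower() for name in wanted}
    mismatches = []
    for line in hosts_text.splitlines():
        # Drop comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        # First field is the IP address; the rest are names and aliases.
        for name in line.split()[1:]:
            if name.lower() in wanted_lower and name not in wanted:
                mismatches.append(name)
    return mismatches


# Hypothetical example data, not taken from a real host.
sample_hosts = """\
127.0.0.1   localhost
10.0.0.11   Esx01.example.local Esx01
10.0.0.12   esx02.example.local esx02
"""

print(find_case_mismatches("esx01.example.local", sample_hosts))
```

An empty list means the capitalization agrees. Anything it returns is a name worth correcting in either the hosts file or the configured hostname.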
Sure thing... VMware used to have it as a best practice to fill the hosts file with the ESX servers and VC... Nowadays I think the best practice has been changed to not filling the hosts file. But I have seen three environments updated so far, and all had HA problems to some degree... So keep the capitalisation of your hostnames (and hosts file) in mind!
I have seen problems with capital letters in hosts files before (on version 3.0.x at least), as the HA scripts try (or tried) to grep for the host name without using "grep -i".
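To show why a missing "-i" matters, here is a small Python sketch of a case-sensitive versus a case-insensitive line search. I have not looked at the actual HA scripts, so this only illustrates the general effect; the helper grep_lines and the sample hosts entry are hypothetical.

```python
import re


def grep_lines(pattern, text, ignore_case=False):
    """Return the lines of text that contain pattern as a literal string.
    ignore_case=True mimics adding -i to grep."""
    flags = re.IGNORECASE if ignore_case else 0
    regex = re.compile(re.escape(pattern), flags)
    return [line for line in text.splitlines() if regex.search(line)]


# Hypothetical hosts entry with a capitalized name.
hosts_text = "10.0.0.11   Esx01.example.local Esx01\n"

# Case-sensitive search for the lowercase name finds nothing.
print(grep_lines("esx01", hosts_text))
# Case-insensitive search finds the entry.
print(grep_lines("esx01", hosts_text, ignore_case=True))
```

A script that relies on the case-sensitive form would conclude the host is missing from the file even though the entry is there.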
Maybe I'm missing something, but in U2 isn't VirtualCenter supposed to take over the DNS role? Why is there still a dependency on the hosts file in that case? It should ignore hosts file and DNS entries, should it not?
Uhm, I think the ESX hosts check their hosts file first, and DNS only after that. So leaving your hosts file empty might solve the problem as well... It used to be best practice to always fill your hosts file (for HA especially), but nowadays I think VMware's best practice is NOT to fill the hosts file (HA uses its own dynamically built hosts file anyway...)
