I had an ESXi 4 Update 1 host that I upgraded to 4.1 using Update Manager. After it rejoined the cluster, the host shows a red mark and reports:
"HA agent on has an error : Error while running health check script"
Two hosts in a separate cluster do not have the same issue.
Thus far I've:
Reconfigured HA
Disabled and re-enabled HA for the cluster
Restarted the management agents
Restarted the server
Put the server in another cluster
...all result in the same error.
Any ideas?
Did you upgrade vCenter to 4.1 first? Doing so is best practice (and should fix this error).
Hi,
Thanks for the reply. I did upgrade vCenter first, since I used Update Manager to do the upgrade. Update Manager cannot upgrade hosts to 4.1 unless vCenter has been upgraded first.
Oh, I have seen some people take the host out of vCenter, upgrade it, and then try to put it back in. I thought maybe that's what you did. I have seen this error a lot when clients are running an older vCenter after updating the host.
What does the error message under the new cluster operational status say?
http://www.yellow-bricks.com/2010/07/20/cool-vsphere-4-1-feature-cluster-operational-status/
I have the very same error on a pair of 2850s that I upgraded to 4.1 using Update Manager. I tried the same things Gallwapa did and the error will not go away. The cluster has been very stable ever since I brought it to 4.0.
Any hint from the community on what to check next would be great.
Thanks
I have the same problem, and worse. I have tried all the usual techniques to fix HA, but to no avail.
It affected all the hosts in my cluster. I logged an SR with VMware. The first step is finding out whether it is vCenter related or ESXi related...
Please consider marking my answer as "helpful" or "correct"
Hi,
I had the same issue when I upgraded vCenter Server to release 4.1. The HA clusters show the same error messages.
Check out this discussion on VMware Communities:
http://communities.vmware.com/message/1571656
The weird thing is that this only happens on ESXi hosts. I have an HA cluster with two ESX 4.0 U1 hosts without any issue!
I followed the steps mentioned in that discussion, but they basically amount to re-installing the ESXi hosts. It's necessary to reconfigure all the host settings, but at least I have the HA clusters working again!
Regards / Saludos
-
Patricio Cerda
VMware VCP-410
Join the Virtualizacion en Español group on LinkedIn
-
If you find this or any other answer useful please consider awarding points by marking the answer helpful or correct. Thank you.
Yep, it is a complete re-install, so to speak, and even though it may work, I don't think it is acceptable. I hope VMware has an answer for this.
Of course it's not acceptable, but so far VMware hasn't given any answer. So, for now, I don't have any other choice if I want the HA cluster to work!
Well, we had host profiles ready to go, so a reinstall took under 10 minutes for me. We couldn't wait for VMware to come up with an answer, as our HA may have been at risk over the weekend.
Host Profiles are a good resource in this case. I had to use them too in order to get my HA cluster working again. Too bad, it was the only issue I had when I did the vCenter Server upgrade.
Did you see any other issues after upgrading to vCenter 4.1, such as the query service not working or time-outs when sorting lists?
Cheers
Besides the error about the HA agents on ESXi, I have an error message in Converter Health Status. However, I can still use vCenter Converter without any problem.
This is discussed in detail in this link:
http://communities.vmware.com/thread/276101
I mentioned this in the other thread, but there is now a KB article for the VirtualCenter / VMware Converter issue.
I am also seeing this error, but as of now it's at a single client site and doesn't seem to be hurting anything. I want to find the fix before I update the hosts to 4.1.
I am having this same darn issue. Has anyone found a fix for this? I really don't want to take down my entire cluster just to fix this problem.
Edit: I was able to get around this by turning off HA on the cluster entirely. I then re-enabled it after a few minutes and everything seemed fine.
Blog: www.virtualizationbuster.com
Twitter: s1xth
I am also trying to resolve this problem and although I have not found a solution, I have started creating an article documenting what I have found in regards to this error. I believe I am very close to finding a solution.
UPDATE: I found the solution!! Read article below:
(Solved) VMware HA Fails after Upgrade to ESXi 4.1
Good information. Glad you posted it.
I believe, though, that turning HA off at the cluster level may be enough; if not, then I can see trying the method of stopping the HA services.
I am still in the process of upgrading my cluster so I am sure I am going to run into this issue again. Hopefully your method or disabling HA on the cluster continues to work correctly.
I tried everything, including disabling and re-enabling HA on each host and on the entire cluster, lots of ESX host reboots, etc. Nothing worked until I removed the folders for the HA client (using the uninstall script) from the ESX box, as my article instructs.
You could be onto something.
I might have to give that a go. Reading KB1007234, though, I have no aam_config_util.def file on my host and I don't see that particular error anywhere either. However, we do have a few hosts with a second service console. I have always had the allowNetwork0 parameter on my cluster and it never caused a problem before.