VMware Cloud Community
Gallwapa
Contributor

After 4.1 upgrade, 1 host "Error while running health check script" - What to check next?

I had an ESXi 4.0 Update 1 host, which I upgraded to 4.1 using Update Manager. After it rejoined the cluster, the host shows a red mark and states:

"HA agent on has an error : Error while running health check script"

Two hosts in a separate cluster do not have the same issue.
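
To confirm which hosts report this error and which do not, exported event text can be filtered for the message. This is only a minimal sketch: it assumes each event line has the shape "HA agent on <host> ... has an error : <reason>", which is a guess based on the message quoted above, and the function name and host names are made up for illustration.

```python
import re

# Assumed event shape: "HA agent on <host> ... has an error : <reason>".
# This pattern is a guess from the quoted message, not a documented format.
HA_ERROR = re.compile(
    r"HA agent on (?P<host>\S+).*?has an error\s*:\s*(?P<reason>.+)"
)

HEALTH_CHECK_REASON = "Error while running health check script"


def hosts_with_ha_error(lines, reason=HEALTH_CHECK_REASON):
    """Return the sorted host names whose HA agent reported `reason`."""
    hosts = set()
    for line in lines:
        match = HA_ERROR.search(line)
        # Strip whitespace and any closing quote before comparing.
        if match and match.group("reason").strip(' "') == reason:
            hosts.add(match.group("host"))
    return sorted(hosts)


if __name__ == "__main__":
    # Hypothetical sample lines; real exports may look different.
    sample = [
        "HA agent on esx01.example.local in cluster Prod has an error : "
        "Error while running health check script",
        "HA agent on esx02.example.local has an error : Some other failure",
    ]
    print(hosts_with_ha_error(sample))
```

Comparing the resulting list across clusters makes it easier to see whether only the upgraded hosts are affected.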

Thus far I've

Reconfigured HA

Disabled HA/Enabled HA for the cluster

Restarted management agents

Restarted Server

Put server in another cluster

...all result in the same error.

Any ideas?

Accepted Solutions
computerguy7
Enthusiast

I am also trying to resolve this problem. Although I have not found a solution yet, I have started writing an article documenting what I have found regarding this error. I believe I am very close to finding a solution.

UPDATE: I found the solution!! Read article below:

(Solved) VMware HA Fails after Upgrade to ESXi 4.1

33 Replies
tietzjd25
Enthusiast

Did you upgrade vCenter to 4.1 first? Doing so is best practice (and should fix this error).

Joe Tietz VCAP-DCD Solutions Architect
Gallwapa
Contributor

Hi,

Thanks for the reply. I did upgrade vCenter first, since I used Update Manager to do the upgrade. Update Manager cannot upgrade hosts to 4.1 unless vCenter has been upgraded first.

tietzjd25
Enthusiast

Oh, I have seen some people take the host out of vCenter, upgrade it, and then try to put it back in. I thought maybe that's what you did. I have seen this error a lot when clients are running an older vCenter after updating the host.

What does the error message under the new cluster operational status say?

http://www.yellow-bricks.com/2010/07/20/cool-vsphere-4-1-feature-cluster-operational-status/

gerdpeterw
Contributor

I have the very same error on a pair of 2850s that I upgraded to 4.1 using Update Manager. I tried the same things Gallwapa did and the error will not go away. The cluster has been very stable ever since I brought it to 4.0.

Any hint from the community on what to check next would be great.

Thanks

AllBlack
Expert

I have the same problem, and worse. I have tried all the usual techniques to fix HA, but to no avail.

It affected all the hosts in my cluster. I logged an SR with VMware. The first step is finding out whether it is vCenter related or ESXi related...

Please consider marking my answer as "helpful" or "correct"

pcerda
Virtuoso

Hi,

I had the same issue when I upgraded vCenter Server to release 4.1. The HA clusters showed the same error messages.

Check out this discussion on VMware Communities:

http://communities.vmware.com/message/1571656

The weird thing is that this only happens on ESXi hosts. I have an HA cluster with two ESX 4.0 U1 hosts without any issue!

I followed the steps mentioned in that discussion, but it's basically the same as reinstalling the ESXi hosts. It is necessary to reconfigure all the host settings, but at least I have the HA clusters working again!




Regards / Saludos

Patricio Cerda

VMware VCP-410

If you find this or any other answer useful please consider awarding points by marking the answer helpful or correct. Thank you.

Regards / Saludos - Patricio Cerda - vExpert 2011 / 2012 / 2013
AllBlack
Expert

Yep, it is a complete reinstall, so to speak. Even though it may work, I don't think it is acceptable, and I hope VMware has an answer for this :)

pcerda
Virtuoso

Of course it's not acceptable, but so far VMware has not given any answer. So, for now, I don't have any other choice if I want the HA cluster to work! :(






Gallwapa
Contributor

Well, we had host profiles ready to go, so a reinstall took under 10 minutes for me. We couldn't wait for VMware to come up with an answer, as our HA may have been at risk over the weekend.

pcerda
Virtuoso

Host Profiles are a good resource in this case. I had to use them too in order to get my HA cluster working again. Too bad; it was the only issue I ran into when I upgraded vCenter Server.




AllBlack
Expert

Did you see any other issues after upgrading to vCenter 4.1, such as the query service not working or time-outs when sorting lists?

Cheers

pcerda
Virtuoso

Besides the error about the HA agents on ESXi, I have an error message in Converter Health Status. However, I can still use vCenter Converter without any problem.

This is discussed in detail in this link:

http://communities.vmware.com/thread/276101




Gallwapa
Contributor

I mentioned this in the other thread, but there is now a KB article for the VirtualCenter / VMware Converter issue.

tietzjd25
Enthusiast

> Did you see any other issues after upgrading to vCenter 4.1, such as the query service not working or time-outs when sorting lists?

I am also seeing this error, but as of now it's at a single client site and does not seem to be hurting anything. I want to find the fix before I update the hosts to 4.1.

s1xth
VMware Employee

I am having this same darn issue. Has anyone found a fix for this? I really don't want to take down my entire cluster just to fix this problem.

Edit: I was able to get around this by turning off HA on the cluster entirely. I then re-enabled it after a few minutes and everything seemed fine.

Blog: www.virtualizationbuster.com

Twitter: s1xth

http://www.virtualizationimpact.com http://www.handsonvirtualization.com Twitter: @jfranconi
computerguy7
Enthusiast

I am also trying to resolve this problem. Although I have not found a solution yet, I have started writing an article documenting what I have found regarding this error. I believe I am very close to finding a solution.

UPDATE: I found the solution!! Read article below:

(Solved) VMware HA Fails after Upgrade to ESXi 4.1

s1xth
VMware Employee

Good information. Glad you posted it.

I believe, though, that turning HA off at the cluster level may be enough. If not, then I can see trying the method of stopping the HA services.

I am still in the process of upgrading my cluster so I am sure I am going to run into this issue again. Hopefully your method or disabling HA on the cluster continues to work correctly.

computerguy7
Enthusiast

I tried everything, including disabling and re-enabling HA on each host and on the entire cluster, lots of ESX host reboots, etc. Nothing worked until I removed the folders for the HA client (using the uninstall script) from the ESX box, as my article instructs.

AllBlack
Expert

You could be onto something :)

I might have to give that a go. Reading KB1007234, though, I have no aam_config_util.def file on my host and I don't see that particular error anywhere either. However, we do have a few hosts with a second service console. I have always had the allowNetwork0 parameter on my cluster and it never caused a problem before.
