VMware Cloud Community
mawelch
Contributor

Disconnected host and vmware-aam frozen

Hey,

Yesterday two hosts in our cluster failed. I believe this happened because we overcommitted resources, which started causing problems with our HA/DRS setup.

On each host we had errors like this:

5/4/2009 11:32:24 AM, HA Agent on <host> in cluster <> in <> has an error

These errors recurred every 3 minutes on both hosts. A few of them came with a related event that said "Insufficient resources to satisfy HA failover level on cluster <cluster> in <datacenter>."

About 30 minutes into the errors, both hosts showed a disconnected state in VIC.

At this point I went to the console to restart the management services so that I could reconnect the hosts to the cluster and adjust my resources. The console froze, though, and I couldn't move the highlight.

After searching the internet, I found a few suggestions to go to the terminal and run /sbin/services.sh restart. That command hung while stopping vmware-aam and never completed.

Luckily the VMs were still running, so we were able to schedule a maintenance window, take the VMs down, and reboot the blades.

Has anyone run into this situation, and if so, is there a solution that doesn't require taking an outage?

matt

AndreTheGiant
Immortal

Are you sure that the network is working fine?

How many ESX nodes do you have?

Andrea

Andrew | http://about.me/amauro | http://vinfrastructure.it/ | @Andrea_Mauro
jshiplett
Enthusiast

You're running out of inodes on the affected hosts. This is an issue with vCenter 2.5u3. From the 2.5u4 release notes:

-

ESX Server 3i Installations Run Out of Inodes and Stop Responding to the VirtualCenter Server

In previous releases, when an ESX Server 3i host in an HA-enabled cluster is isolated from the VirtualCenter network and reconnected back to the VirtualCenter network, after a while no free inodes might be available for the ESX Server 3i and the ESX Server 3i host might stop responding to the VirtualCenter Server and the VI Client.

This issue is resolved.

-

Updating to vCenter 2.5u4 should fix the issue.
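
If you want to confirm inode exhaustion before the next incident, a rough check like the one below may help. It is only a sketch: it assumes a Unix-like host where Python 3 is available, which may well not be true of the ESX Server 3i console itself (there, `df -i`, if the console's df supports it, reports the same numbers). The function names and the 90% threshold are made up for illustration and are not anything VMware ships.

```python
import os


def inode_usage(path="/"):
    """Return (total, free, percent_used) inodes for the filesystem holding path."""
    st = os.statvfs(path)
    total, free = st.f_files, st.f_ffree
    # Some filesystems report 0 total inodes; avoid dividing by zero.
    used_pct = 0.0 if total == 0 else 100.0 * (total - free) / total
    return total, free, used_pct


def is_inode_starved(path="/", threshold_pct=90.0):
    """True when inode usage is at or above the threshold (90% is an arbitrary choice)."""
    return inode_usage(path)[2] >= threshold_pct


if __name__ == "__main__":
    total, free, pct = inode_usage("/")
    print(f"inodes: total={total} free={free} used={pct:.1f}%")
    if is_inode_starved("/"):
        print("warning: filesystem is close to running out of inodes")
```

Running something like this on a schedule and alerting when the warning appears would at least give you time to plan a reboot rather than being surprised by a disconnected host.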

VCAP4-DCD - VCP 3/4/5, VCP-Cloud, CCNA, vExpert '12/'13 - http://blog.shiplett.org