VMware Cloud Community
WM280
Enthusiast

Alarm State = Not responding AND State = Unknown, however host is online with no errors

Dear Community

I am running vCenter 5.5, which manages several remote ESXi 5.5 hosts.

Over the weekend, I received e-mail alerts related to some of these remote hosts:

Target: x.x.x.x

Previous Status: Green

New Status: Red

Alarm Definition:

([Red state Is equal to notResponding] AND [Red state Not equal to standBy])

Current values for metric/state:

State = Not responding AND State = Unknown

Description:

Alarm 'Host connection and power state' on x.x.x.x changed from Green to Red

My "Host connection and power state" alarm definition is also set to send an e-mail when the status goes back to Green; however, I never received any e-mails after the one above.

Checking my ESXi hosts, I don't see any problems at all. Hardware Status is also "all green".

Any advice on troubleshooting steps?

greco827
Expert

Are there any indications in the vmkernel or vpxa logs indicating any loss of connectivity to/from vCenter?
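If it helps, here is a rough sketch for skimming copies of those logs (on ESXi 5.x they are normally /var/log/vmkernel.log and /var/log/vpxa.log) for lines that hint at a dropped connection. The keyword list is my own guess at useful search terms, not a list of actual vpxa or vmkernel message strings, and `find_suspect_lines` is just an illustrative helper name.

```python
import re

# Assumed search terms only; these are NOT taken from real vpxa/vmkernel
# messages. Adjust them to whatever you actually see in your logs.
PATTERN = re.compile(
    r"heartbeat|timed? ?out|disconnect|not responding|connection (lost|reset)",
    re.IGNORECASE,
)


def find_suspect_lines(lines, pattern=PATTERN):
    """Return (line_number, text) pairs for lines matching the pattern."""
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, 1)
        if pattern.search(line)
    ]
```

To use it, copy a log off the host and pass the open file, e.g. `find_suspect_lines(open("vpxa.log", errors="replace"))`, then compare the timestamps of any hits with the time of the alarm e-mail.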

WM280
Enthusiast

Thank you for the advice.

I had a look at those log files but did not see anything specific.

The hosts that got disconnected (or for which these alarms were generated) are located in remote datacenters.

The connection is via the internet (IPsec tunnel), so there could be short periods of high latency that cause the vCenter connection to time out.

I was also checking the alert settings and confirmed that I should receive a notification when the connectivity is re-established.

Last night, I received a notification for one of the same hosts again (Previous Status: Green, New Status: Red); however, this time I received another notification shortly after (Previous Status: Red, New Status: Green).

Since all my ESXi hosts appear to be online and working, I will leave this issue aside for now and monitor for recurrences.

greco827
Expert

It could definitely be a matter of network latency or lack of bandwidth. Since the hosts are remote, the WAN is likely your bottleneck. I would think that your network team could shed some light on that.
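If you want some numbers to take to them, a small probe run from the vCenter server can show whether the tunnel has latency spikes or drops. Below is a minimal sketch, assuming TCP port 443 is reachable on the hosts; the function names are made up for illustration. Keep in mind that, as far as I know, the host-to-vCenter heartbeat travels over UDP 902, so this only measures the path in general and not the heartbeat itself.

```python
import socket
import statistics
import time


def tcp_connect_ms(host, port=443, timeout=5.0):
    """Return the TCP connect time in milliseconds, or None on failure."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return (time.monotonic() - start) * 1000.0
    except OSError:
        # Covers timeouts, refused connections and name resolution errors.
        return None


def summarize(samples):
    """Summarize probe results; None entries count as failed probes."""
    ok = [s for s in samples if s is not None]
    return {
        "sent": len(samples),
        "failed": len(samples) - len(ok),
        "median_ms": statistics.median(ok) if ok else None,
        "max_ms": max(ok) if ok else None,
    }


def probe(host, port=443, count=10, interval=1.0):
    """Probe the host `count` times, pausing `interval` seconds between tries."""
    samples = []
    for _ in range(count):
        samples.append(tcp_connect_ms(host, port))
        time.sleep(interval)
    return summarize(samples)
```

Calling `probe("x.x.x.x")` with the real host address in place of the placeholder, ideally on a schedule over a few days, should show whether failed probes or large `max_ms` values line up with the times of the alarm e-mails.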
