VMware Cloud Community
froulund
Contributor

Stateless event alarm

Hi

I get this event every 5 minutes from my vCenter Server, but only for some of my 18 hosts.

When I restart vCenter Server, the event disappears for a while.

Thanks

Henrik Froulund Hansen

-


Target: esx07.int.unicon.dk

Stateless event alarm

Alarm Definition:

( OR )

Event details:

Error detected on esx07.int.unicon.dk in Brondby: Agent can't send heartbeats.msg size: 612, sendto() returned: Operation not permitted

-


AndreTheGiant
Immortal

Are the errors always on the same hosts?

Are all 18 hosts in the same cluster?

Or different clusters in the same datacenter?

Andre

Andrew | http://about.me/amauro | http://vinfrastructure.it/ | @Andrea_Mauro
froulund
Contributor

It seems to happen when I reboot any of the hosts.

And the messages disappear for all the servers if I restart the vCenter service!

There is no alarm in the client!

/Henrik

AndreTheGiant
Immortal

Do you have a single cluster?

Which version of ESX and VC are you using?

Andre

froulund
Contributor

One cluster with 5 hosts, plus 13 hosts outside any cluster.

It's ESX 4.0 and VC 4.0.0 build 162856 :o))

This is from the latest test:

-


Target: esx01.int.unicon.dk

Stateless event alarm

Alarm Definition:

( OR OR OR OR )

Event details:

Host esx01.int.unicon.dk in Brondby is not responding

-


And after a restart of the VC server:

-


Target: Datacenters

Stateless event alarm

Alarm Definition:

()

Event details:

VCSERVER02.int.unicon.dk (vCenter) status changed from red to green

-


But there is no alarm inside the VC client?

Regards,

Henrik

PeteKowalsky
Enthusiast

Hey froulund --

This happens to me too and is VERY annoying (alerts are forwarded to my team's mobile devices). I turned on the alarm for "Host Error", and I believe it's a repeating alarm by default. (Obviously if you set it not to repeat it will be MUCH less annoying.) As for the root cause, I don't know for sure, but here's what I've been able to determine:

1) It happens at roughly the same time of day when it does happen.

2) It seems to be related to a scheduled task with VMware Update Manager scanning hosts (6:00AM every day for me), and when VUM scans the hosts for updates and patches, it appears that the vc.Integrity (agent?) modifies the host firewall in real time so it can perform the scan.

3) I can reproduce the error in my environment on random hosts in any cluster by running the scheduled task manually.

4) It seems that this is a transient error that resolves itself shortly after the alarm condition is created once the host firewall is "re-adjusted".
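
To check whether the pattern in points 1) to 4) holds across many alerts, it can help to pull the host, message size, and socket error out of each event. The following is a minimal Python sketch, assuming the message layout matches the one event quoted in this thread; the regular expression and the `parse_heartbeat_error` helper are hypothetical, not a VMware-documented format. On Linux-based systems, `sendto()` commonly fails with "Operation not permitted" when a local firewall rule rejects the outgoing packet, which would be consistent with point 2), but that reading is an assumption.

```python
import errno
import os
import re

# Event text copied from the alarm quoted earlier in this thread.
EVENT = (
    "Error detected on esx07.int.unicon.dk in Brondby: "
    "Agent can't send heartbeats.msg size: 612, "
    "sendto() returned: Operation not permitted"
)

# Hypothetical pattern derived from that single sample message.
PATTERN = re.compile(
    r"Error detected on (?P<host>\S+) in (?P<datacenter>.+?): "
    r"Agent can't send heartbeats\.msg size: (?P<size>\d+), "
    r"sendto\(\) returned: (?P<error>.+)$"
)


def parse_heartbeat_error(message):
    """Return the fields of a heartbeat error event, or None if it does not match."""
    match = PATTERN.match(message)
    if match is None:
        return None
    fields = match.groupdict()
    fields["size"] = int(fields["size"])
    # True when the error text is this platform's description of EPERM.
    fields["is_eperm"] = fields["error"] == os.strerror(errno.EPERM)
    return fields


if __name__ == "__main__":
    print(parse_heartbeat_error(EVENT))
```

Grouping the parsed results by host and timestamp would show whether the errors cluster around the scheduled scan time.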

WORKAROUND: My solution, which has worked so far, is to modify the "Host error" alarm definition by adding an advanced trigger condition requiring that "Full message" is not equal to the string "Agent can't send heartbeats.msg size". Voila. Annoying alarm pages / emails are averted but the events are still logged, and the transient condition passes without freaking everybody out or annoying the crap out of everybody... ;)
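
The effect of that extra trigger condition can be sketched outside vCenter. This is a minimal Python illustration, not the vSphere alarm API: `should_notify` is a hypothetical helper, and it assumes the condition behaves like a "does not contain" test on the event's full message (the exact matching semantics inside vCenter are not confirmed in this thread). The sample messages are the ones quoted above.

```python
# Marker string from the workaround; events containing it are logged but not paged.
SUPPRESSED_TEXT = "Agent can't send heartbeats.msg size"


def should_notify(full_message):
    """Return True if the event should still raise a page or email."""
    # Assumption: the alarm condition acts as a substring exclusion.
    return SUPPRESSED_TEXT not in full_message


# Event messages quoted earlier in this thread.
events = [
    "Error detected on esx07.int.unicon.dk in Brondby: Agent can't send "
    "heartbeats.msg size: 612, sendto() returned: Operation not permitted",
    "Host esx01.int.unicon.dk in Brondby is not responding",
    "VCSERVER02.int.unicon.dk (vCenter) status changed from red to green",
]

to_page = [message for message in events if should_notify(message)]

if __name__ == "__main__":
    for message in to_page:
        print(message)
```

Only the transient heartbeat error is filtered out; other host errors, such as a host not responding, still reach the notification list.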

If this is helpful or "correct", let me know!

UPDATE: The error seems to come most often from the vSphere host that my vCenter VM is running on. I vMotioned it to another host, re-ran the update task, and the heartbeats.msg host error now occurs on the new host running my vCenter VM. Still got no "alarms" or annoying pages though...

Regards,

Pete Kowalsky - VCP3, CISSP, CCNP, CCSP, ADHD, blah blah blah

(Points for helpful / correct are appreciated...)

froulund
Contributor

Hi

After I deleted everything in "Host error" and re-inserted it, I don't have any "fake" alarms.

So right now I am happy :)

Thanks to all and have a nice summer!

/Henrik

mrcbldn
Contributor

Hello,

In my case, too, the fake alarms stopped after I recreated the "Host error" alarm.

Marco.

ABKICT
Contributor

Why not, as a workaround, restart the failing service after the update process has finished?

Would that not be a more elegant solution?
