Re: Triggered alerts not clearing

sxnxr · ‎06-28-2018

I have created an alert for "Host network redundancy lost". It uses a clone of the OOTB fault symptom

and this is my alert

The alert triggers as expected but stays active and does not clear when the redundancy is restored. Am I doing something wrong?

The problem is that we use these alerts to automatically cut tickets in ServiceNow, but as long as the alert is still active in vROps it will never trigger again and cut a new ticket.

sxnxr · ‎06-28-2018

I am also having the same problem with storage redundancy alerts

mghall · ‎06-28-2018

Seeing the same behavior with a new alert.

We're running vROps 6.7. The alert was created for a Windows service, to check whether it was running or not.

Initially it was triggered on the service, not on the server. I had to go back and modify the alert definition so it reported on the server. I've now got both conditions reporting. I've also gone back and verified that the service is running correctly and the EPOP agent is running. Starting to look at the logs now.

daphnissov · ‎06-28-2018

Historically, these types of non-clearing alerts have been confirmed as bugs. I don't know if that's the case here, but if the condition is no longer true in the infrastructure being monitored and the alert isn't clearing in vROps despite correctly-configured wait and cancel cycles, I'd open an SR to get confirmation.
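For readers unsure what the wait and cancel cycles mentioned above do, here is a minimal Python sketch of the idea. It assumes a symptom becomes active only after its condition has been true for `wait_cycle` consecutive collection cycles, and clears only after the condition has been false for `cancel_cycle` consecutive cycles. The function and parameter names are made up for illustration; this is a toy model, not vROps code or its API.

```python
def symptom_states(samples, wait_cycle=1, cancel_cycle=1):
    """Return the symptom state (True = active) after each collection cycle.

    samples: one boolean per collection cycle, True when the monitored
    condition (e.g. "redundancy lost") holds in that cycle.
    wait_cycle / cancel_cycle: hypothetical names for the number of
    consecutive cycles needed to trigger / to clear the symptom.
    """
    active = False
    streak = 0  # consecutive cycles that contradict the current state
    states = []
    for condition_true in samples:
        if condition_true != active:
            streak += 1
            threshold = cancel_cycle if active else wait_cycle
            if streak >= threshold:
                active = condition_true
                streak = 0
        else:
            streak = 0
        states.append(active)
    return states


# Condition holds for three cycles, then redundancy is restored.
print(symptom_states([True, True, True, False, False, False],
                     wait_cycle=2, cancel_cycle=3))
# [False, True, True, True, True, False]
```

In this model the symptom clears on the third healthy cycle. If an alert in the real product stays active well past its configured cancel cycle while the condition is clearly false, that points towards the kind of bug described above rather than a configuration mistake.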

------------------
How to Ask for Help on Tech Forums
https://neonmirrors.net

MeImNot76 · ‎07-09-2018

Hello @mghall

Could you explain briefly how you modified the alert to report on the server rather than on the service please?

Thank you!

sxnxr · ‎07-10-2018

I have a support call open with VMware, and they are following the normal SOP: apply HF9 or go to 6.7. Which is great, because it takes 3h to do the upgrade to 6.7 and 1h to apply HF9, so if any alerts are generated during that time they will not create incident tickets for our NOC.

PLEEEEEESE give me a no down time upgrade

daphnissov · ‎07-10-2018

They're giving you a hotfix for vROps 6.6.1?


sxnxr · ‎07-10-2018

Yep, Hot Fix 9 (I already have it for a different problem, to do with policies not being pushed out to all nodes in the cluster).

daphnissov · ‎07-10-2018

Can you explain the contents of this hot fix, please? Just in case others come across this thread, what symptoms are you seeing that necessitated this (if different from host disconnection alerts not clearing)?


RickVerstegen · ‎07-24-2018

I am experiencing the same issues mentioned in this thread with 6.6.1. Will a public hotfix/patch be released for this?

In my case the issues are related to disk space.

Was I helpful? Give a kudo for appreciation!
Blog: https://rickverstegen84.wordpress.com/
Twitter: https://twitter.com/verstegenrick

sxnxr · ‎08-08-2018

This is all the info I was given on HF9:

This HF will address general issues like VSAN, API, License, alerts and alarms, policy

We applied it because every time you take the cluster offline and bring it back online in 6.6.1 without HF9, it takes up to 10 minutes for all your custom groups to run their membership rules and add the objects to them. This was causing us problems because we have different alerting levels set in different policies, so on startup all objects were members of the default policy because the membership rules for the custom groups had not yet been evaluated. HF9 fixes this, as all objects stay in their custom groups when the cluster is rebooted or taken offline.

The second reason was that alarms were being generated on some objects even though we had them disabled in the policies. It turns out there is a bug that can cause a policy update not to be pushed out to all the nodes in the cluster. Depending on which node was evaluating the alarm trigger, it could have been looking at an old policy and triggering the alarm when it should not have.
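To make the stale-policy behaviour easier to picture, here is a toy Python model of it. It assumes each node keeps its own copy of the policy and that whichever node evaluates the alarm uses only its local copy. The node names, the alert name, and the `alert_fires` helper are all hypothetical; none of this reflects how vROps is actually implemented.

```python
def alert_fires(node_policies, evaluating_node, alert_name):
    """Return True if the alert is enabled in the evaluating node's local policy copy."""
    return node_policies[evaluating_node][alert_name]


# The alert was disabled in the policy, but the update only reached node-1.
node_policies = {
    "node-1": {"disk-space-low": False},  # has the updated policy
    "node-2": {"disk-space-low": True},   # still holds the old policy
}

print(alert_fires(node_policies, "node-1", "disk-space-low"))  # False
print(alert_fires(node_policies, "node-2", "disk-space-low"))  # True
```

The same object can therefore raise or suppress the alarm depending purely on which node happens to evaluate it, which matches the inconsistent behaviour described above.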
