VMware Cloud Community
jroh
Contributor

X times in Y minutes question?

I'm using HQ version 3.0.4 (build #389 - Apr 27, 2007 - Release Build) and trying to use the "X times in Y minutes" alert config option to mitigate temporary failures, load spikes, etc. in my alerts. For a number of them I have it set to 2 times in 10 minutes, which seems to translate to 120 seconds out of 600 seconds in the alert log. However, it seems to take only one failure on a service that's monitored every minute to trigger that alert.

My question is: how is Hyperic determining how long the failure lasted? Is it 120 seconds because that's the time between successful results, or is this a bug and my 1 failure on a service monitored every minute should only count as 60 seconds? Or am I missing something? Thanks!
6 Replies
admin
Immortal

What is the collection interval of your metrics? How often are the
alerts occurring?

Charles


jroh
Contributor

The collection interval is 1 minute and an alert will trigger if 1 interval matches the alert condition.
admin
Immortal

So you mean you have 1 time in 1 minute?
admin
Immortal

OK, after re-reading your first and last posts, I think I understand what you mean. If you have it collecting at a 1-minute interval and set your alert definition to fire 2 times in 10 minutes, then you'll get an alert every 2 minutes while the condition keeps matching. The 10 minutes sets a boundary for how long to allow for 2 collections to match the condition, but it is not an exclusionary period. It doesn't mean that the alert will fire only 1 time in 10 minutes; it means that every time you have 2 matches within a 10-minute window, an alert will fire.
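
To make the counting rule concrete, here is a rough Python sketch of the behavior as described in this post. It is not Hyperic's actual code: the names `CountInWindow` and `fired_at` are made up for illustration, and two details are assumptions (the match counter is cleared after each alert, which is what yields "an alert every 2 minutes", and a match exactly 10 minutes old no longer counts).

```python
from collections import deque


class CountInWindow:
    """Hypothetical model of an 'X times in Y minutes' alert rule."""

    def __init__(self, count, window_s):
        self.count = count          # X: matches required
        self.window_s = window_s    # Y: window length in seconds
        self.matches = deque()      # timestamps of matching collections

    def observe(self, timestamp, matched):
        """Record one collection; return True if the alert fires."""
        # Forget matches that have aged out of the window (assumed: age >= Y).
        while self.matches and timestamp - self.matches[0] >= self.window_s:
            self.matches.popleft()
        if not matched:
            return False
        self.matches.append(timestamp)
        if len(self.matches) >= self.count:
            # Assumption: counting starts over once the alert has fired.
            self.matches.clear()
            return True
        return False


def fired_at(samples, count=2, window_s=600):
    """Return the timestamps at which the rule fires.

    `samples` is a list of (timestamp_seconds, matched) pairs in time order.
    """
    rule = CountInWindow(count, window_s)
    return [t for t, matched in samples if rule.observe(t, matched)]
```

Under this model, a service failing on every 1-minute collection fires at 60, 180 and 300 seconds (every 2 minutes), while a single failed collection never fires. An alert after only one failure would therefore not match the rule as described here.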

If you want to restrict the number of alerts you receive, you should do it through the escalation scheme. Select "Suppress Alerts" for 10 minutes. That way every time an alert fires, the app will wait 10 minutes before firing another one.
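
As a rough illustration of what a 10-minute suppression period does to a stream of alerts, here is a small Python sketch. It only models the behavior described above and is not how HQ implements escalations; `SuppressAlerts` and `notifications` are hypothetical names, and treating an alert exactly 10 minutes after the last one as allowed is an assumption.

```python
class SuppressAlerts:
    """Hypothetical model of 'Suppress Alerts for N minutes'."""

    def __init__(self, quiet_s):
        self.quiet_s = quiet_s    # suppression period in seconds
        self.last_sent = None     # time of the last alert let through

    def allow(self, timestamp):
        """Return True if an alert at `timestamp` should go out."""
        if self.last_sent is not None and timestamp - self.last_sent < self.quiet_s:
            return False          # still inside the quiet period
        self.last_sent = timestamp
        return True


def notifications(fire_times, quiet_s=600):
    """Filter alert times (seconds, ascending) through the suppression period."""
    gate = SuppressAlerts(quiet_s)
    return [t for t in fire_times if gate.allow(t)]
```

For example, alerts firing every 2 minutes at 60, 180, 300, 420, 540, 660 and 780 seconds would be reduced to the ones at 60 and 660 seconds.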

Charles
jroh
Contributor

Thanks for responding, Charles. The issue I seem to be facing is that if I set the alert to fire for 2 times in 10 minutes, on occasion it seems to fire when there is only 1 failure within that 10-minute window. My issue isn't so much that it continues to fire, but that it fires prematurely. Thanks!
jroh
Contributor

Just a couple of screenshots to hopefully show what I'm seeing.