VMware Cloud Community
gmatney
Enthusiast

Trying to understand alert - Enable Action: Once every _ times conditions are met within a time period of _ minutes.

I created an availability monitor using a script plugin that runs every 5 minutes.  I then created an alert with this:

If Condition: Availability < 100.0% of 100.0% (Max Value)

Enable Action: Once every 2 times conditions are met within a time period of 6 minutes.

The alert triggers on a single bad sample (the script returns code 2 instead of 0).  The documentation doesn't explain exactly how "x times within y minutes" works, but I assumed it would take 2 failures for the alert to trigger.  I would also assume that 2 consecutive failed samples (5 minutes apart) would trigger the alert, because both fall within the 6-minute time period.
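
To make the assumption above concrete, here is a minimal sketch of the semantics the poster expects. It is not the product's actual implementation: the function name `alert_fire_times`, its parameters, and the rule "fire when `count` failures fall inside a sliding window, then start counting again" are all hypothetical, and the sample date is arbitrary.

```python
from collections import deque
from datetime import datetime, timedelta


def alert_fire_times(samples, count=2, window_minutes=6):
    """Return the timestamps at which the alert would fire.

    samples: iterable of (timestamp, return_code) in chronological order.
    Assumption: a nonzero return code means the condition is met, and the
    alert fires once `count` failures fall within `window_minutes`.
    """
    window = timedelta(minutes=window_minutes)
    recent_failures = deque()
    fired = []
    for ts, rc in samples:
        if rc == 0:
            continue  # healthy sample, condition not met
        recent_failures.append(ts)
        # Drop failures that are older than the window.
        while ts - recent_failures[0] > window:
            recent_failures.popleft()
        if len(recent_failures) >= count:
            fired.append(ts)
            recent_failures.clear()  # assumed "once every N times": count again
    return fired


start = datetime(2024, 1, 1, 6, 0)  # arbitrary example date
single_failure = [(start, 2), (start + timedelta(minutes=5), 0)]
two_failures = [(start, 2), (start + timedelta(minutes=5), 2)]

print(alert_fire_times(single_failure))  # expected under this model: no alert
print(alert_fire_times(two_failures))    # expected: one alert, at the 6:05 sample
```

Under this model a single failed sample never fires the alert, while two failures 5 minutes apart do, because both fit inside the 6-minute window.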

Am I wrong about how this works, or is this a bug?

3 Replies
admin
Immortal

Hi,

You set the availability collection interval to 5 minutes and your alert condition is "once every 2 times", so the minimum time period in which your condition can be satisfied is 10 minutes.

Please change the time period to 10 minutes or more (exactly 10 minutes is OK) and it should work for you.

Tal

gmatney
Enthusiast

It is triggering - after a single occurrence.

I'm not sure I understand the 10-minute minimum you mention.  It doesn't take 10 minutes to get 2 samples 5 minutes apart.  For example, sample 1 is at 6:00 and sample 2 is at 6:05.  If both of those show the resource as unavailable, why wouldn't the alert trigger at 6:05 rather than waiting a full 10 minutes?  But that's not my problem anyway - I've had no trouble getting the alert to trigger.  My problem is that it's triggering on a single failed sample.
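
The arithmetic in this reply can be checked with a short sketch. The helper name `span_of_consecutive_samples` is hypothetical and the date is arbitrary; it only assumes that samples arrive exactly one collection interval apart.

```python
from datetime import datetime, timedelta


def span_of_consecutive_samples(first_sample, interval_minutes, count):
    """Return (time of the last sample, elapsed time) for `count` samples.

    `count` consecutive samples are separated by `count - 1` intervals.
    """
    elapsed = timedelta(minutes=interval_minutes * (count - 1))
    return first_sample + elapsed, elapsed


last, elapsed = span_of_consecutive_samples(datetime(2024, 1, 1, 6, 0), 5, 2)
print(last.strftime("%H:%M"), elapsed)  # 06:05 0:05:00
```

Two samples at a 5-minute interval span 5 minutes, which fits inside the configured 6-minute period.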

gmatney
Enthusiast

I think I'll give up on this one.  I've tried to recreate the too-quick alerts, but when I force the bad return codes to test the problem, the alert triggers exactly as I want.  Two consecutive bad return codes 5 minutes apart trigger the alert, and it doesn't trigger on a single failure.

Over the weekend, a single failure with the same rc=2 was triggering the alert.  I can't explain the difference in behavior, but it's time to move on.
