I am struggling to understand why my alert fired. After reading Alert States and Lifecycle, I noticed that my alert uses an Aggregate Function. So I suspect the answer relates to item b under "I don't think my alert should've fired. Why did it?" in Alert States and Lifecycle, but I'm not sure how. I performed the following steps:
- Narrowed the time window in the Wavefront graph to 5 minutes, so I could concentrate on only the period just before and after the alert.
- Put my cursor on the alert. From here I could see details indicating that the host First Affected by this alert was "web0228".
- Changed Chart Type from "Line Plot" to "Point Plot" so I could more easily see what values were being reported. From this I could see that there were points in time during which data was not received.
- Split the Aggregate Query into 3 separate individual queries, so that I could see what actual values were present. The values in question are HTTP response codes. The Aggregate Query is configured to use ">= 400", so it is looking for any value of 400 or above. The maximum value received during the time period in question was 204. So we didn't receive any value that reached the 400 threshold, and yet the alert fired.
- Appended the string cname=web0228... to the query and unticked the 2 individual queries that were unrelated to this alert firing. This allowed me to focus only on data points for this host, but I didn't see any unusual pattern here. From what I can see, data was being reported at a standard cadence.
- Looked at the configuration of the alert in question. The Alert History shows that the alert was modified the day before it fired, with Alert minutes updated from 10 to 15. I don't think this is relevant.
- Noticed that when I choose Backtesting in the Edit Alert page, it does not indicate that an alert would have fired at the time it did. Does this perhaps indicate that data was received late and interpolation was being used?
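To make my suspicion concrete, here is a toy Python sketch (not Wavefront code) of how an aggregate with interpolation could reach 400 even though no individual series does. Everything in it is an assumption for illustration: I am assuming the aggregate is a sum across hosts, the hosts other than web0228 are made up, and the timestamps and values are invented to resemble what I saw (status codes of 200 and 204, with gaps).

```python
from bisect import bisect_left

THRESHOLD = 400  # the alert condition from my post: ">= 400"

# Hypothetical per-host series of (timestamp_seconds, http_status).
# Only web0228 is a real host name; the data itself is invented.
series_by_host = {
    "web0228": [(0, 200), (60, 200), (120, 204)],
    "web-b": [(0, 200), (120, 200)],    # no point reported at t=60
    "web-c": [(60, 204), (120, 204)],   # starts reporting late
}


def interpolate(series, t):
    """Value of a sorted (ts, value) series at time t.

    Uses linear interpolation between neighbouring points and
    returns None when t lies outside the series' time range.
    """
    times = [ts for ts, _ in series]
    i = bisect_left(times, t)
    if i < len(times) and times[i] == t:
        return series[i][1]
    if i == 0 or i == len(times):
        return None
    (t0, v0), (t1, v1) = series[i - 1], series[i]
    return v0 + (v1 - v0) * (t - t0) / (t1 - t0)


def aggregate_sum(all_series):
    """Sum across series at every reported timestamp, filling gaps by interpolation."""
    timestamps = sorted({ts for s in all_series for ts, _ in s})
    result = []
    for t in timestamps:
        values = [interpolate(s, t) for s in all_series]
        result.append((t, sum(v for v in values if v is not None)))
    return result


raw_max = max(v for s in series_by_host.values() for _, v in s)
aggregated = aggregate_sum(list(series_by_host.values()))
would_fire = any(v >= THRESHOLD for _, v in aggregated)

print("max raw value:", raw_max)
print("aggregated points:", aggregated)
print("condition met at some point:", would_fire)
```

In this made-up data the highest raw value is 204, yet every aggregated point is at or above 400, so a ">= 400" condition on the aggregate would be met. I don't know whether this is what actually happened with my alert, which is really my question.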
I'm missing something in my understanding here. But what?