Trouble disabling (replacing) a disk space alert in vROps 6.1

JoshBarfield · ‎02-04-2016

Hello! We have several VMs with a drive where capacity pretty much needs to be ignored. I've created custom symptoms and a custom alert, and applied that alert in a policy to the custom group containing those VMs. However, I still see one of the default alerts appear for VMs in that policy. I've dug around and changed settings many times but have not found what I'm missing. Any help is appreciated!

Image 1 - Here you see that my VM has the Self Managed SQL policy applied, and is showing an alert

Image 2 - The alert info shows Guest file system space usage at immediate level - Guest File System stats F:\.

Image 3 - Clicking the Go to Alert Definition button at the top near Summary shows me the alert is One or more virtual machine guest file systems are running out of disk space. For some reason, this alert is triggering on my policy and I don't want it to.

Image 4 - This is my custom Alert Definition Self Managed SQL guest file systems are running out of disk space. It contains symptoms for warn/critical on C:\, D:\, and a 98% info symptom for F:\ (which is the drive I want to mostly ignore).

Image 5 - Here are my Symptom Definitions which are pulled into my Alert Definition. Nothing fancy. Warn/critical at 90/95% for C:\ and D:\ and then info at 98% on F:\.
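For reference, the thresholds described above can be modeled in a few lines of Python. This is only a sketch of the intended behavior, not how vROps evaluates symptoms internally: the `THRESHOLDS` table and `severity_for` function are hypothetical names, and the strict "greater than" comparison is an assumption.

```python
from typing import Optional

# Thresholds as described in the post, highest first: (percent, severity).
# The table layout and names are hypothetical, for illustration only.
THRESHOLDS = {
    "C:\\": [(95.0, "critical"), (90.0, "warning")],
    "D:\\": [(95.0, "critical"), (90.0, "warning")],
    "F:\\": [(98.0, "info")],
}


def severity_for(drive: str, usage_percent: float) -> Optional[str]:
    """Return the most severe symptom the usage crosses, or None."""
    # Assumption: a symptom triggers when usage is strictly above its limit.
    for limit, severity in THRESHOLDS.get(drive, []):
        if usage_percent > limit:
            return severity
    return None
```

Under this model, `severity_for("F:\\", 92.0)` returns `None`, so F:\ at 92% should raise nothing from the custom alert, while the same usage on C:\ or D:\ would be a warning.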

Image 6 - Now let's look at my active policies. Self Managed SQL is there with priority 1, so if there were any conflicts, I believe it would take the lead.

Image 7 - Here's the policy tree.

Image 8 - Now let's get into my policy. First, the Metrics and Properties.

Image 9 - Next, Alert Definitions in the policy. Enabled my Self Managed SQL alert and disabled four others, including One or more virtual machine guest file systems are running out of disk space which we saw active in image 2 above!

Image 10 - I've left Symptom Definitions set to inherit. My understanding is that if the Alert Definition is disabled, the Symptom Definition won't matter. I've tried disabling these as well and have not noticed a difference.
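The behavior expected from this setup can be written down as a small Python model. It is a sketch under stated assumptions, not vROps's actual policy engine: the `is_alert_enabled` function and the dictionary layout are hypothetical, and it assumes a local setting in the more specific policy always wins over an inherited one.

```python
from typing import Dict, List

# Hypothetical model: a policy maps an alert definition name to
# True (locally enabled) or False (locally disabled).
# A missing name means "inherit from the next policy in the chain".
Policy = Dict[str, bool]

DEFAULT_ALERT = (
    "One or more virtual machine guest file systems "
    "are running out of disk space"
)
CUSTOM_ALERT = (
    "Self Managed SQL guest file systems are running out of disk space"
)


def is_alert_enabled(alert_name: str, chain: List[Policy]) -> bool:
    """Resolve an alert's state; chain runs from most specific to base policy."""
    for policy in chain:
        if alert_name in policy:
            return policy[alert_name]
    # Assumption: an alert that no policy mentions is treated as disabled.
    return False


# Assumed contents of the two policies, based on the description above.
default_policy: Policy = {DEFAULT_ALERT: True}
self_managed_sql: Policy = {DEFAULT_ALERT: False, CUSTOM_ALERT: True}
chain = [self_managed_sql, default_policy]
```

With this chain, `is_alert_enabled(DEFAULT_ALERT, chain)` is `False` and `is_alert_enabled(CUSTOM_ALERT, chain)` is `True`, which is the outcome the policy is meant to produce for VMs in the custom group.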

Image 11 - And finally I apply the policy to the Self Managed SQL Servers custom group. That group is of type Environment and contains members defined by Virtual Machine Object name.

Wayne1j · ‎02-04-2016

*Created on 1/27/16.*

I'm assuming this is before you updated your policy.

Have you tried clearing the alert manually?

When I set up my custom disk alerts the existing ones stayed active until I restarted the monitoring system.

JoshBarfield · ‎02-04-2016

Yes, I've cleared the alerts manually and they pop back up on the next refresh. When I added my custom alert for Log Insight disk space, the alert immediately cleared itself and hasn't shown since. I was expecting the same behavior here so I haven't tried restarting anything yet.

simonea · ‎02-05-2016

On image 3 "One or more virtual machine guest file systems are running out of disk space"

This would suggest that the policy in effect on the VM is still using that alert definition, since 92% > 90%.

Have you tried removing the immediate and warning symptoms, and/or modifying the percentage they trigger on in that policy?

JoshBarfield · ‎02-05-2016

I agree, it's like the VM has my policy applied, but is still pulling alerts from the default policy.

I've set the Guest file system space usage at immediate level symptom to locally disabled for this policy and after clearing the alert it pops again on the next refresh. I then set Guest file system space usage at immediate level to locally enabled for the policy with an override set at >99 threshold. Cleared the alert, and it popped again.

Image 12 - Guest file system space usage at immediate level symptom locally disabled

Image 13 - Guest file system space usage at immediate level symptom locally enabled and overridden to >99

JoshBarfield · ‎02-08-2016

Today a VM hit 99.297% full on the F:\ drive, triggering my custom Guest File System Stats F:\|Guest File System Usage (%) while still also showing the Guest file system space usage at critical level default alert. So I know my custom alert is working, but disabling the defaults is not, and attempting to override the defaults also does not appear to be working.

JoshBarfield · ‎02-08-2016

Similar to another issue I was having related to tags, I tried rebooting the nodes in my cluster and after everything was back online the erroneous alerts were gone. I'll give it a day or two before I confirm it's 100% cleared, but this seems to have fixed it.

kaufmanm · ‎02-15-2016

Yes, this was my first thought reading your OP. In my experience with alerts and symptoms in a multi-node cluster, behavior frequently won't change until you restart the cluster. Which is really frustrating when that takes so long and you're not sure exactly how you want to set something up. I believe I ran into some situations where applying the change while logged into the master node worked better than while logged into a data node, so I try to make changes like these while logged into the master, rather than the load-balanced URL I distribute to users.

Trouble disabling (replacing) a disk space alert in vROps 6.1