We have vROPS 8.1.0 and NSX-T 3.0.1 in our new environment. In vROPS we have enabled notifications for critical and immediate events only, and we noticed that repeated alerts are triggered by two alert definitions, "NSX-T Management service has failed" and
"Management service monitor runtime state has failed", for the services below. These services are not running by default, and we do not want to enable them for now.
We tried creating a symptom def as below and assigned to a new policy and disabled the original alert/symptom def in that. However, the alerts are not stopping. I may be doing something wrong, as I am very new to this product. Please advise.
My advice is to use policies to control all the alerts. However, you should have only one active policy. Another can be added if you have vRA or a VDI environment. Other than that, there is no special need for multiple active policies, as they only create confusion about how alerts, metrics, etc. are managed.
That said, setting your notifications to send all critical and immediate alerts is already a bad idea: the people receiving the alerts will just filter them into folders, so the alerts will never reach their inbox.
Third, disable the main alert in the policies, then create a new one and modify it until you get it right. This involves logic and a lot of testing. Even though the symptom is info, the alert definition you attached it to could be set to critical, which will override the info.
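To illustrate that last point, here is a minimal sketch in Python. It is not vROPS code, only a model of the behaviour described above, and it assumes the alert definition either carries a fixed criticality or is set to a "symptom based" mode that defers to its symptoms. The function name and the level names are hypothetical.

```python
# Hypothetical model of how an alert's criticality is resolved; not a vROPS API.
LEVELS = ["info", "warning", "immediate", "critical"]  # least to most severe


def effective_criticality(alert_def_criticality, symptom_criticalities):
    """Return the criticality the triggered alert would carry.

    Assumption: a fixed level on the alert definition always wins; only the
    "symptom based" setting defers to the most severe triggered symptom.
    """
    if alert_def_criticality != "symptom based":
        return alert_def_criticality
    # Pick the symptom level that sits furthest along the LEVELS list.
    return max(symptom_criticalities, key=LEVELS.index)


# An info symptom attached to a critical alert definition still alerts as critical.
print(effective_criticality("critical", ["info"]))
# With a symptom-based definition, the info symptom stays info.
print(effective_criticality("symptom based", ["info"]))
```

So if the new alert definition is left at critical, a notification rule that matches critical alerts would still fire, even though the new symptom is info.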
You mentioned having only one policy, but what if you have different thresholds on different objects of an object type at different sites?
If you have 630 host systems at site A and 840 host systems at site B, wouldn't you have different policies in this scenario?
The reason I am asking is that I have multiple policies for different equipment, and I am concerned I may not be getting alerts that I should.
In your final statement about turning off alerts on the main policy, do you mean the default policy?
@lannguyen As you suggested in your third point, I already created a new symptom def as info and assigned it to a new alert def. I also disabled the default NSX-T service monitoring alert and symptom def in the new policy. However, it keeps sending notifications. I'm asking here in case somebody has information on disabling monitoring of a particular NSX-T Manager service in vROPS.
If the alerting and metrics are the same on both site A and site B, then there is no need for two policies. If site A and site B have different alerts, then yes, you will need two policies.
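The rule above can be sketched as a small Python helper: sites whose alert settings are identical can share one policy, and every distinct set of settings needs its own. This is only an illustration; the setting name and threshold values are made up, and nothing here talks to vROPS.

```python
from collections import defaultdict


def policies_needed(site_settings):
    """Group sites that share identical alert settings.

    site_settings maps a site name to a dict of setting name -> value.
    Each returned group of sites could be covered by a single policy.
    """
    groups = defaultdict(list)
    for site, settings in site_settings.items():
        # Identical settings produce the same key, regardless of dict order.
        key = tuple(sorted(settings.items()))
        groups[key].append(site)
    return sorted(sorted(sites) for sites in groups.values())


# Hypothetical thresholds: same values on both sites, so one policy is enough.
same = {
    "site A": {"host_cpu_critical_pct": 90},
    "site B": {"host_cpu_critical_pct": 90},
}
# Different values, so each site needs its own policy.
different = {
    "site A": {"host_cpu_critical_pct": 90},
    "site B": {"host_cpu_critical_pct": 80},
}
print(len(policies_needed(same)), len(policies_needed(different)))
```

The number of host systems at each site does not matter here; only a difference in the alerts or thresholds does.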
Turn off the alert in every policy that is active. The name of the default policy varies. Any policy can be the default; it will have the letter D next to it instead of a numeric value.
Well, this is how I would do it:
1. Disable all the alerts in all active policies. Make sure the alerts are no longer there.
2. Clear all the old alerts by going in and cancelling them. Wait about an hour and see if any come back. If they do come back, it's because you haven't disabled the alerts properly.
3. Once you are sure that the alerts are disabled for good, create a new alert with only the services you want attached to it as symptoms. Make sure that alert is active in the policy.
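The check in step 2 can be sketched in Python as follows. This assumes you have exported the current alerts (for example from the UI or the REST API) into a list of records after the one-hour wait; the field names "definition" and "status" and the sample records are hypothetical, only the two alert definition names come from this thread.

```python
# The two alert definitions that should be disabled in every active policy.
DISABLED_DEFINITIONS = {
    "NSX-T Management service has failed",
    "Management service monitor runtime state has failed",
}


def reappeared(alerts, disabled=DISABLED_DEFINITIONS):
    """Return active alerts that still come from definitions meant to be disabled."""
    return [
        a for a in alerts
        if a["definition"] in disabled and a["status"] == "ACTIVE"
    ]


# Hypothetical export taken about an hour after cancelling the old alerts.
alerts_after_wait = [
    {"definition": "NSX-T Management service has failed", "status": "ACTIVE"},
    {"definition": "NSX-T Management service has failed", "status": "CANCELED"},
    {"definition": "Some other alert", "status": "ACTIVE"},
]

for alert in reappeared(alerts_after_wait):
    print("Still firing:", alert["definition"])
```

If this list is not empty, the original definition is still enabled in at least one policy that applies to the object, and step 1 needs to be repeated before moving on to step 3.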
Looking at your screenshots, the state is not local, which means it won't alert at all.
@lannguyen I appreciate your response. My first query in this post is all about your suggested steps, and I'm hoping for some assistance in creating the right symptom/alert def to suppress this notification:
"We tried creating a symptom def as below and assigned to a new policy and disabled the original alert/symptom def in that. However, the alerts are not stopping"