VMware Horizon Community
chicojr
Enthusiast

Monitor and Alert on problem vms and hosts

Our environment is currently on 6.0.1 build-2088845; apparently we like to walk into hornets' nests. I'm trying to find a way to alert us when we have problem VMs or hosts in this environment, and I'm coming up short. We have an old version of SolarWinds (SAM), but I'd like to know what people use today for alerting and for staying proactive.

5-9-2019 10-22-11 AM.png

4 Replies
mchadwick19
Hot Shot

If you have Horizon Enterprise, vROps is the way to go, I would say. I have used it in my current and previous jobs. It's pretty easy to set up, but you would need to create your own alerts for the problem VMs with your own thresholds. Although with over 250 sessions and 4 problem VMs, that's not too bad.

VDI Engineer VCP-DCV, VCP7-DTM, VCAP7-DTM Design
chicojr
Enthusiast

Yes, thank you. We have vROps Standard and Horizon View Enterprise. If you happen to have directions, that would be great.

BenFB
Virtuoso

It will depend on the type of problems you want to be alerted on. vCenter has some built-in functionality for alarming that might work for you. Otherwise you will need to look at a third-party monitoring tool or write your own PowerCLI/View API (Horizon 7.x) scripts to monitor.

If you aren't aware, Horizon 6.x goes End of Support in less than a month on 2019/06/19. Please plan an upgrade to 7.x ASAP.

https://www.vmware.com/content/dam/digitalmarketing/vmware/en/pdf/support/product-lifecycle-matrix.p...

mchadwick19
Hot Shot

This is quite long so bear with me.

The issue with View and vROps on the problem-desktop front is that a problem desktop is defined as one whose agent has a status that falls within a table of many different statuses.

pastedImage_0.png

What you need to do is create a super metric with this definition. It includes all the statuses from the image above, but you don't need the full list. The big ones you really need are Agent Error, Protocol Failure, Need Reboot, Unreachable, and Already Used. There are a few more that you can add (Unknown, Missing, etc.).

sum((count(${adaptertype=V4V, objecttype=ViewPool, metric=desktop_vms|agent_config_error_status, depth=1}))+(count(${adaptertype=V4V, objecttype=ViewPool, metric=desktop_vms|agent_err_disabled_status, depth=1}))+(count(${adaptertype=V4V, objecttype=ViewPool, metric=desktop_vms|agent_err_domain_failure_status, depth=1}))+(count(${adaptertype=V4V, objecttype=ViewPool, metric=desktop_vms|agent_err_invalid_ip_status, depth=1}))+(count(${adaptertype=V4V, objecttype=ViewPool, metric=desktop_vms|agent_err_need_reboot_status, depth=1}))+(count(${adaptertype=V4V, objecttype=ViewPool, metric=desktop_vms|agent_err_protocol_failure_status, depth=1}))+(count(${adaptertype=V4V, objecttype=ViewPool, metric=desktop_vms|agent_err_startup_in_progress_status, depth=1}))+(count(${adaptertype=V4V, objecttype=ViewPool, metric=desktop_vms|agentunreachable_status, depth=1})))+(count(${adaptertype=V4V, objecttype=ViewPool, metric=desktop_vms|alreadyused_status, depth=1}))

To add another metric to this list, go to the end of the expression, move in one ")" and insert +(count(${adaptertype=V4V, objecttype=ViewPool, metric=desktop_vms|<metric name>, depth=1}))
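Since the expression is just the same term repeated with a different metric name, it can be easier to generate it than to edit it by hand. Below is a minimal Python sketch of that idea. The metric names are copied from the definition above; the `build_super_metric` helper and the choice of which statuses to include are my own illustration. Note that this variant puts every term inside the `sum(...)`, so check the result in the super metric editor's preview before relying on it.

```python
# One term of the super metric; only the metric name changes between terms.
# Doubled braces are literal braces for str.format().
TERM = "(count(${{adaptertype=V4V, objecttype=ViewPool, metric=desktop_vms|{name}, depth=1}}))"

# A short list for illustration (metric names taken from the definition above).
STATUSES = [
    "agent_err_need_reboot_status",
    "agent_err_protocol_failure_status",
    "agentunreachable_status",
    "alreadyused_status",
]


def build_super_metric(statuses):
    """Join one count(...) term per status with '+' and wrap them all in sum(...)."""
    if not statuses:
        raise ValueError("at least one status metric is required")
    terms = [TERM.format(name=name) for name in statuses]
    return "sum(" + "+".join(terms) + ")"


if __name__ == "__main__":
    # Paste the printed expression into the super metric editor.
    print(build_super_metric(STATUSES))
```

Adding or removing a status is then a one-line change to `STATUSES`, and the parentheses stay balanced.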

What this does is take a count of each of the different agent statuses and combine them into a single metric, which I labelled "Problem Desktops" in vROps. You're not done yet...

You then want to apply this super metric to the "VDI Desktop Pool" object type and enable it in your policy set. Select Administration > Policies > Policy Library > select whichever policy is active > Edit > Collect Metrics/Properties > search for "Problem Desktops" > enable "State" (and "KPI" and "DT" if you want them) for the entry with the "VDI Desktop Pool" object type. This will take a little while to start showing up in your environment, so I'd suggest waiting an hour or so to get some good data in vROps.

Now you need to create your symptoms to define when this metric reaches the warning, error, and critical thresholds. After that, create the alerts that will be triggered when those thresholds are met. I didn't write instructions for this because these steps are pretty straightforward and well documented. The super metric piece, however, is a real PITA.
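To make the symptom step concrete, here is a rough Python sketch of the logic the symptoms encode: each one compares the "Problem Desktops" value against a minimum and maps it to a severity. The numbers are hypothetical placeholders, not recommendations, and `classify` is just an illustrative helper; in vROps you would express the same comparisons as symptom definitions on the super metric.

```python
# Hypothetical thresholds, checked from most to least severe.
# Tune these to your own pool sizes; they are not vROps defaults.
THRESHOLDS = [
    ("critical", 10),
    ("error", 5),
    ("warning", 1),
]


def classify(problem_desktops):
    """Return the most severe level whose minimum the count reaches, else 'ok'."""
    for level, minimum in THRESHOLDS:
        if problem_desktops >= minimum:
            return level
    return "ok"


if __name__ == "__main__":
    for count in (0, 4, 7, 12):
        print(count, classify(count))
```

With these placeholder values, the 4 problem VMs mentioned earlier in the thread would land in "warning".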

If you have any questions - let me know!
