I've been using vCOPs (Enterprise vApp) for a good while now with fairly mixed results. One issue that is stumping me right now is monitoring the status of the vCOPs VMs and services themselves. I have had a couple of instances where vCOPs has just stopped logging data, or where the main web interface has stopped responding, and I only became aware of it when my colleagues or I tried to use it.
So what I'm really asking is: what do you guys do to monitor the general health of vCOPs itself? Does anyone use Xymon/Hobbit or Nagios-type client software installed on the Linux VMs? Does anyone monitor the actual /vcops-vsphere, /vcops-custom or /admin pages? Or do you use another strategy altogether?
Thanks in advance
Do you know that vCOPs has some built-in metrics and alerts for that?
e.g. self-monitoring
Those are available to look at on the "About" page in the vSphere UI.
Also, alerts are generated when free space on either VM drops below 10%.
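For anyone who wants the same kind of free-space check from an agent or cron job on the VMs rather than from vCOPs itself, here is a minimal Python sketch. The 10% threshold is the one mentioned above; the function names and the choice of which mount point to check are my own assumptions, not anything vCOPs ships.

```python
import shutil


def free_percent(path):
    """Return the free space on the filesystem holding `path`, as a percentage."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.free / usage.total


def below_threshold(path, threshold_pct=10.0):
    """True when free space has dropped below the threshold (10% by default)."""
    return free_percent(path) < threshold_pct
```

Something like `below_threshold("/")` could then drive an email or an exit code for whatever monitoring system you already run.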
What gradinka said - ever since ~5.7ish (maybe 5.6, but I've forgotten) the self-monitoring resources and relationships have included OS and app-level monitoring on the vApp VMs.
If you want to view the metrics more 'freely', open up the Custom UI and find the vCenter Operations Manager Adapter Instance resource. This includes a real nice hierarchy of the application tiers and their components. The data is very good and is usually more than you need to troubleshoot the guest OS and application.
vCOPs monitoring itself is all well and good up until the point that it dies and is no longer able to report or alert. What I am really after is the ability to monitor the state of the critical services that vCOPs runs, but from another system (like Nagios or Xymon etc).
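To make the "from another system" idea concrete, here is a rough Python sketch of an external probe for the three UI pages mentioned earlier in the thread. It is only an illustration under several assumptions: the host name is a placeholder you pass in, the appliance is assumed to use a self-signed certificate (so verification is switched off), and treating a 401/403 login challenge as "up" is my own judgement call. The 0/2/3 return values follow the usual Nagios plugin convention (OK, CRITICAL, UNKNOWN).

```python
import http.client
import ssl
import urllib.error
import urllib.request

# Page paths taken from the thread; everything else here is illustrative.
PATHS = ["/vcops-vsphere", "/vcops-custom", "/admin"]


def classify(status):
    """Map an HTTP status (None = no response) to a Nagios-style code."""
    if status is None:
        return 2  # CRITICAL: nothing answered at all
    if 200 <= status < 400 or status in (401, 403):
        return 0  # OK: the web tier responded (assumption: a login challenge counts)
    return 2  # CRITICAL: error page or unexpected status


def probe(url, timeout=10):
    """Return the HTTP status for `url`, or None if the request failed outright."""
    ctx = ssl.create_default_context()
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE  # assumption: self-signed appliance certificate
    try:
        with urllib.request.urlopen(url, timeout=timeout, context=ctx) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        return exc.code  # the server answered, just not with a 2xx
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        return None


def main(argv):
    """Check every page on the given host and return the worst result."""
    if len(argv) != 2:
        print("usage: check_vcops_ui.py <ui-vm-hostname>")
        return 3  # UNKNOWN
    worst = 0
    for path in PATHS:
        status = probe("https://" + argv[1] + path)
        code = classify(status)
        print(path, "status:", status, "result:", "OK" if code == 0 else "CRITICAL")
        worst = max(worst, code)
    return worst
```

Saved as a script that ends with `sys.exit(main(sys.argv))` (after `import sys`), this could be called from Nagios or as a Xymon external test. It only proves the web tier is answering, not that data collection is still running, so it complements the self-monitoring rather than replacing it.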
Self-monitoring will only go so far IMO.
This is probably only useful for a small number of people, but on my blog I have documented how I now monitor the general status of the vCOPs vApp: I use the Self Monitoring option, use vCenter to alert, and install the Xymon/Hobbit agent to report into a Xymon monitoring server.
I reckon you could easily do the same for Nagios but I haven't had the time to try this.
Anyway, shameless self-promotion link below