VMware Cloud Community
StephenMoll
Expert

High Co-Stop Alarm

Is it possible to set up an alarm for occurrences of excessively high co-stop?

Furthermore, is it possible for these alarms to be sent as syslog messages?

8 Replies
StephenMoll
Expert

Is there a convenient way of polling the hosts for co-stop metrics and effectively capturing occurrences of excessively high co-stop?

jatinjsk
Enthusiast

1. vCenter does not offer many CPU-related counters for setting alarms, and there is no alarm definition specifically for co-stop.

2. If you have vRealize Operations Manager, then you can create this alarm on a virtual machine. Check the vRealize Operations Manager 6.2 Documentation Center; there is a specific alarm definition for high co-stop. From there you can push these alarms either to vCenter or to your ticketing system.

3. No, you cannot pass this information to syslog.

daphnissov
Immortal

To give my +1, you need a larger operations tool like vROps to accomplish this. Once you have it, it's a trivial matter to set something like this up.

StephenMoll
Expert

Unfortunately it is not something I have available, or am even likely to get. I am hopeful we might get a licence or two for vROps to use during development and integration, but it is unlikely to be available in a deployed system.

Continuous monitoring of %CSTP and %RDY is probably not necessary. All our workloads are in fixed chunks. Each chunk is a host's worth of workload. Every VM has a vCPU reservation and a limit. They are actually the same value, so in essence every VM gets what it needs and no more. The sum total of these reservations is calculated so that no workload is more than 88% of the vCPU resources available. So I am hoping that once system integration is completed, any likelihood of excessive %CSTP or %RDY will be pretty much eliminated.
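
As a rough illustration of the sizing rule described above, the check amounts to summing the per-VM reservations and comparing the total against 88% of the host's CPU capacity. The sketch below assumes reservations and capacity are both expressed in MHz; the function name and the example figures are hypothetical and not taken from the actual system.

```python
def reservation_headroom(reservations_mhz, host_capacity_mhz, ceiling=0.88):
    """Return (fraction_reserved, within_ceiling) for one host's chunk of workload.

    reservations_mhz  -- per-VM CPU reservations in MHz (hypothetical values)
    host_capacity_mhz -- total CPU capacity of the host in MHz
    ceiling           -- maximum fraction that may be reserved (0.88 = 88%)
    """
    if host_capacity_mhz <= 0:
        raise ValueError("host capacity must be positive")
    fraction = sum(reservations_mhz) / host_capacity_mhz
    return fraction, fraction <= ceiling


if __name__ == "__main__":
    # Hypothetical host: 16 cores at 2500 MHz = 40000 MHz of capacity.
    fraction, ok = reservation_headroom([10000, 10000, 15000], 16 * 2500)
    print(f"reserved {fraction:.1%} of host capacity, within ceiling: {ok}")
```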

I was wondering if there was anything in vSphere or ESXi itself that records occurrences of high %CSTP or %RDY, and whether this might be used to indicate that a performance issue may have occurred and log it for later analysis. I might be wrong, but I thought ESXi recorded stats that can be harvested by the vSphere performance monitors every 20 seconds or so, and imagined that this might be a way to do this. We have our own application managing the clusters, which could possibly be coded to support this and create the syslog messages, provided the metrics are there to be gathered in the first place.
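
A minimal sketch of what such a check might look like inside the managing application, assuming the co-stop and ready values arrive as millisecond summation counters over a 20-second sample (for example counters such as cpu.costop.summation and cpu.ready.summation; how the samples are fetched is not shown). The function names and the threshold defaults are hypothetical placeholders, and dividing by the vCPU count assumes the value is the VM-level aggregate.

```python
import logging

SAMPLE_INTERVAL_S = 20  # assumed real-time sample length, per the "every 20 seconds" above


def summation_to_percent(summation_ms, interval_s=SAMPLE_INTERVAL_S, vcpus=1):
    """Convert a millisecond summation for one sample into a per-vCPU percentage."""
    return 100.0 * summation_ms / (interval_s * 1000.0 * vcpus)


def check_sample(vm_name, costop_ms, ready_ms, vcpus, logger,
                 costop_limit=3.0, ready_limit=5.0):
    """Log a warning for each metric above its limit and return the messages.

    The limits are placeholder percentages, not recommended values.
    """
    readings = (
        ("%CSTP", summation_to_percent(costop_ms, vcpus=vcpus), costop_limit),
        ("%RDY", summation_to_percent(ready_ms, vcpus=vcpus), ready_limit),
    )
    messages = []
    for label, value, limit in readings:
        if value > limit:
            msg = f"vm={vm_name} metric={label} value={value:.2f} limit={limit:.2f}"
            logger.warning(msg)
            messages.append(msg)
    return messages


if __name__ == "__main__":
    logging.basicConfig(level=logging.WARNING)
    log = logging.getLogger("costop-monitor")  # hypothetical logger name
    # 2000 ms of co-stop across 2 vCPUs in a 20 s sample is 5% per vCPU.
    check_sample("example-vm", costop_ms=2000, ready_ms=400, vcpus=2, logger=log)
```

To forward the warnings to a syslog collector, a logging.handlers.SysLogHandler from the Python standard library could be attached to the logger; the collector address would be site-specific.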

daphnissov
Immortal

There's nothing in ESXi that would journal those messages as it's the responsibility of vCenter. The problem is that you may need to increase performance stats levels to get what you need for the period you want, then go about pulling that from vCenter. At that point, you're assuming a lot of technical debt to essentially re-create the wheel versus dropping in a tool that is far more capable.

StephenMoll
Expert

I'm not disagreeing about vROps. I wish I had it now and knew that it would be available on the delivered systems. I have determined, however, that I shouldn't hold my breath.

In the meantime, the other measures described should hopefully support an argument that the chances of %CSTP or %RDY being too high are so small as to be unworthy of concern. The systems will have been tested to within an inch of their lives prior to delivery anyway. This should flush out any problems in advance, such that monitoring post-deployment is deemed an unprofitable exercise.

jatinjsk
Enthusiast

By default you have two alarms, for CPU and memory usage. Most of the time these two alarms are more than sufficient to get an idea of ESXi utilization. High resource utilization may result in resource contention, though that depends on several factors.

Most of the time, monitoring the default resource utilization alarms gives you an overview of possible performance issues in your virtualized infrastructure.

StephenMoll
Expert

Worth looking into. So: log instances of very high host CPU utilisation, on the basis that when this happens there is a likelihood of high %CSTP and %RDY. Determining this for a fact would involve closer examination using esxtop.
