BHaskett
Contributor

ESXTOP Thresholds

Are the thresholds listed on the Yellow Bricks page at http://www.yellow-bricks.com/esxtop/ intended to be spike values or sustained values? If sustained, how long would a value need to stay over the threshold before you would call it something to investigate? For instance, at 10-second intervals, my DAVG, GAVG, KAVG, and QUED averages are all within the specified thresholds. However, I have had a 10-second KAVG spike of 20, a DAVG spike of 30, and a GAVG spike of 40, and none of them remained over the threshold for more than one 10-second interval. Is this cause for concern?

Also, when using ESXPLOT, do you look at the Physical Disk Path DAVG, GAVG, KAVG, and QUED numbers, or the Physical Disk Partition DAVG, GAVG, KAVG, and QUED numbers?

4 Replies
vmdkness
Enthusiast

The bottom of this page has the translations from esxtop counters to their esxtop batch-mode names:

http://communities.vmware.com/docs/DOC-9279

BHaskett
Contributor

I saw that, but it does not distinguish Path from Partition the way ESXPLOT lists them. Any ideas on spikes vs. sustained?

BHaskett
Contributor

Any other thoughts on the graphs I posted and whether or not those thresholds are intended to be a spike value or sustained value?

vmdkness
Enthusiast

In my experience working with VMware escalation engineers on this topic, they focus on the "Dev" entry in esxplot. This is the LUN, and the value is an average over the measured interval.

Peak averages are not supposed to exceed 25 milliseconds. That said, if a response is not received within 5000 milliseconds, I/O is halted. We had major issues, but our latency numbers were over 100 milliseconds.
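
On the spike vs. sustained question, one rough way to look at a series of interval averages is to count how many consecutive intervals stay above the threshold. Below is a minimal Python sketch of that idea. The 25 ms threshold is the figure mentioned above. The three-interval cutoff for "sustained" is purely an illustrative assumption, not a VMware recommendation, and the function names are hypothetical.

```python
from typing import List


def longest_run_over(samples: List[float], threshold_ms: float) -> int:
    """Return the longest run of consecutive samples strictly above threshold_ms."""
    longest = 0
    current = 0
    for value in samples:
        # Extend the current run while over the threshold, otherwise reset it.
        current = current + 1 if value > threshold_ms else 0
        longest = max(longest, current)
    return longest


def classify(samples: List[float],
             threshold_ms: float = 25.0,        # figure quoted in this thread
             sustained_intervals: int = 3) -> str:  # assumed cutoff, adjust to taste
    """Label a series of interval averages as 'ok', 'spike', or 'sustained'."""
    run = longest_run_over(samples, threshold_ms)
    if run == 0:
        return "ok"
    if run < sustained_intervals:
        return "spike"
    return "sustained"


if __name__ == "__main__":
    # One 10-second interval at 30 ms, everything else low.
    print(classify([5.0, 8.0, 30.0, 6.0, 7.0]))
    # Several consecutive intervals over 100 ms.
    print(classify([110.0, 120.0, 105.0, 130.0]))
```

Feeding it the per-interval DAVG values for one LUN (the "Dev" entry) would separate a single 10-second blip from several intervals in a row over the threshold.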
