I am looking at my esxtop output, and from what I can see my CPU load is OK and my %PCPU is OK. Can anyone tell me what the %RDY value on the "system" row is, and why it is so high compared to my VMs, which seem to have an acceptable %RDY? I am having issues with "pokey" VMs. What I am noticing is that %VMWAIT spikes frequently for a VM even though its CPU utilization seems acceptable. I know I am missing something here. Thanks.
2:36:27pm up 5 days 16:04, 671 worlds, 3 VMs, 14 vCPUs; CPU load average: 0.21, 0.22, 0.24
PCPU USED(%): 20 22 30 1.9 33 3.2 6.7 28 5.0 31 3.6 32 33 0.1 28 1.3 0.2 0.2 0.3 30 28 0.0 32 0.0 15 0.0 11 0.2 7.9 0.1 31 0.0 AVG: 13
PCPU UTIL(%): 19 21 30 2.5 31 3.9 7.4 27 5.4 29 4.0 29 30 0.2 25 1.4 0.2 0.2 0.3 28 26 0.1 29 0.1 13 0.1 10 0.3 8.5 0.1 28 0.1 AVG: 13
CORE UTIL(%): 39 31 33 32 33 33 30 26 0.4 28 26 29 13 10 8.5 29 AVG: 25
ID      GID     NAME          NWLD   %USED     %RUN  %SYS     %WAIT  %VMWAIT    %RDY   %IDLE  %OVRLP  %CSTP  %MLMTD  %SWPWT
13679   13679   AAAASERVER02    16  153.50   140.51  0.05   1468.00     0.43    0.13  362.65    0.18   0.00    0.00    0.00
133816  133816  ZZZSERVER02     14  130.42   120.65  0.09   1285.00    16.59    0.11  264.48    0.14   0.00    0.00    0.00
13664   13664   PP01            15   54.34    50.21  0.11   1457.78     2.39    0.05  450.44    0.12   0.00    0.00    0.00
1       1       system         298    1.12  2875.21  0.00  26664.11        -  341.75    0.00    0.93   0.00    0.00    0.00
%VMWAIT could indicate disk or network performance issues, as it means the VM is waiting on the kernel to complete an action. Have you checked your storage in esxtop for latency, etc.?
I have checked my DAVG and KAVG and they look great. Both are running well under 10 ms.
I did expand the system group and noticed that it is mostly idle time.
What I cannot understand is why my %VMWAIT spikes like it does. I can watch and see a VM spike to values from 16-50%. They drop quickly, but it still seems odd with the CPU utilization I am seeing.
It's a little challenging to troubleshoot remotely. Has your host had memory contention issues recently? If so, check the memory view and see what SWCUR is showing. The VM may still have some memory swapped out (even if the host has returned to normal), and this could lead to a temporary %VMWAIT spike while it has to read specific pages back from disk.
SWCUR is at 0.00.
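On the original question about the system %RDY value: in the collapsed view, each esxtop row is a group total summed across all of that group's worlds (the NWLD column), so a group with 298 worlds will show a much larger raw number than a VM with 14-16 worlds. Below is a rough Python sketch that divides the pasted values by NWLD to make the rows comparable. The variable and function names are my own, and dividing by NWLD is only an approximate normalization (for VMs, people often divide by the vCPU count instead). It also checks that %RUN + %WAIT + %RDY + %CSTP comes to about 100% per world, which is what I would expect if the rows really are per-world sums.

```python
# Values copied from the esxtop output posted above.
# Tuple layout (my own choice): name, NWLD, %RUN, %WAIT, %RDY, %CSTP
rows = [
    ("AAAASERVER02", 16, 140.51, 1468.00, 0.13, 0.00),
    ("ZZZSERVER02", 14, 120.65, 1285.00, 0.11, 0.00),
    ("PP01", 15, 50.21, 1457.78, 0.05, 0.00),
    ("system", 298, 2875.21, 26664.11, 341.75, 0.00),
]


def per_world(group_total, nwld):
    """Average a group-level percentage over the worlds in the group."""
    return group_total / nwld


# %RDY averaged per world, keyed by group name.
rdy_per_world = {}
# Sum of the four CPU states averaged per world; should be close to 100.
accounted_per_world = {}

for name, nwld, run, wait, rdy, cstp in rows:
    rdy_per_world[name] = per_world(rdy, nwld)
    accounted_per_world[name] = per_world(run + wait + rdy + cstp, nwld)
    print(
        f"{name:<13} NWLD={nwld:<4}"
        f" %RDY/world={rdy_per_world[name]:7.3f}"
        f" states/world={accounted_per_world[name]:6.1f}"
    )
```

Averaged this way, the system row comes out to a little over 1% ready per world, which does not look alarming on its own. It says nothing about the %VMWAIT spikes, though, so that part of the question is still open.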