VMware Cloud Community
Padone
Contributor

vCPU charts versus perfmon (yet...)

Hi all,

I'm in charge of administering a dozen Windows machines hosted on ESX. Windows performance is monitored by System Center Operations Manager. Those machines were P2V'd, are dual-processor, and host a business application. I have no idea what process was used for the P2V (I'm new in the company). Those Windows machines average ~25% on the counter % Processor Time _Total (in both SCOM and Perfmon).

I have no problem with those computers, but my VM team came this morning and told me I have a serious CPU issue on my VMs. They showed me the vCPU graphs from VI3 for my poor computers and... they show an average utilization of 60%, sometimes up to 90%. What's wrong? My computers seem to be OK, perfmon agrees, and the application users don't encounter any trouble either.

I searched for tips on the Internet and found similar cases caused by a bad P2V process (on multiprocessor machines). But my VM team said, "no no newbie, we follow the right process: downgrade to one processor, then P2V, then migrate, etc...". So, who is right?

Should I believe perfmon? What about the vCPU charts?

Is this a known issue on ESX? If yes, is there a KB or something like that I can use to repair my Windows machines or show to the VM team? How can I be sure... and not get fired if I say "hey guys, today Windows is right ;) you screwed up!" Please advise...

Thanks

Erik_Zandboer
Expert

Hi,

First of all, check the number of vCPUs. Make sure that the processor HAL matches (2 or more vCPUs -> multiprocessor HAL, single vCPU -> uniprocessor HAL). Then check how the perfmon figures relate to the vCPU usage you see in the VI client. I have seen P2Vs (Windows 2000 machines) that show no real trouble in perfmon, while one of the cores simply shoots to 100% in the VI client. If you see a mismatch (vCPU versus perfmon CPU), you want to start looking further. If you are a graphical guy, check out the other CPU-related values you can read from the VMs (realtime span). If you do not know what to make of it, check out:

If you are a command-line guy, check out esxtop. Start with the CPU ready and busy times. These should give an indication. It might simply be a limit put on the vCPU, a share mismatch compared to the other VMs' vCPU shares, or a process gone wild or badly programmed (e.g. running in protected mode). You should always be able to get to the bottom of the problem using these tools.
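One thing that may help when reading the ready values: esxtop reports ready time as a percentage, while the VI client charts show it as a summation in milliseconds per sample. A minimal sketch of the conversion is below, in Python. The function name is made up for illustration, the 20-second default assumes you are looking at the realtime chart, and dividing by the number of vCPUs assumes the value was read at VM level (summed over all vCPUs). Check both against your own chart settings.

```python
def ready_percent(ready_ms, interval_s=20, vcpus=1):
    """Convert a CPU ready summation (milliseconds) to a percentage.

    ready_ms   -- ready value read from the chart for one sample
    interval_s -- length of the sample in seconds (assumed 20 for realtime)
    vcpus      -- number of vCPUs, if ready_ms is the VM-level total (assumption)
    """
    if interval_s <= 0 or vcpus <= 0:
        raise ValueError("interval_s and vcpus must be positive")
    # The sample covers interval_s * 1000 ms of wall-clock time per vCPU.
    return ready_ms * 100.0 / (interval_s * 1000.0 * vcpus)


if __name__ == "__main__":
    # 2000 ms of ready time in a 20 s sample on a dual-vCPU VM
    print(ready_percent(2000, interval_s=20, vcpus=2))
```

This only puts the chart value and the esxtop value on the same scale so they can be compared. What counts as "too much" ready time depends on the workload.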

I would suggest picking one misbehaving VM and zooming in on that one. Share its settings with us (including the resource pool it might reside in): any limits or reservations set, number of vCPUs, etc. Then grab the CPU ready and busy times.

Visit my blog at http://www.vmdamentals.com