VMware Cloud Community
Number774
Contributor

VM says CPU is maxed but host is nearly idle

I'm running a Dell 2950 with two quad-core Xeons at 2 GHz and 8 GB of RAM, under ESX 3.5.0.

Most of my VMs are single-CPU and run fine. I've got two quad-CPU VMs running Windows Server 2003.

None of my VMs do much CPU work for very long: they tend to do a few hours of processing, then go back to sleep for a day or so.

The most recent one is one of the 2003 VMs. It's configured with 4 GB RAM and 4 CPUs, and it's allowed to use whatever CPU it wants. The heaviest workload it does is a Visual Studio build of about 50 projects. Because of the multiple CPUs I hoped that it would do as it does on a workstation, and use multiple CPUs to run the build. Well, it does - but...

At the moment Windows Task Manager in the VM is reporting 96% CPU usage and 2.85 GB of RAM committed. However, Infrastructure Client is reporting the entire system as using about 10% CPU, and shows 600 MB of RAM allocated to that VM. The VM also appears extremely sluggish when I try to remote desktop to it.

It does have VMware Tools installed (version 3.5.0, build 64607), though I haven't rebooted since I installed it (it didn't ask me to).

Any idea why the CPU granted to the client is so low?
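
For what it's worth, here is the rough arithmetic behind that question as a minimal Python sketch. It assumes the figures quoted above (8 cores at 2 GHz, 4 vCPUs, 96% in the guest, about 10% on the host); the function names are purely illustrative, and guest and host percentages are not measured the same way, so treat the result as an order-of-magnitude comparison only.

```python
CORES = 8          # two quad-core Xeons
CORE_GHZ = 2.0     # clock speed per core
VCPUS = 4          # vCPUs configured on the build VM


def host_ghz_used(host_util):
    """GHz the whole host is using, given utilisation as a fraction."""
    return host_util * CORES * CORE_GHZ


def guest_ghz_claimed(guest_util):
    """GHz the guest would be using if every vCPU were backed by a full core."""
    return guest_util * VCPUS * CORE_GHZ


if __name__ == "__main__":
    host = host_ghz_used(0.10)       # Infrastructure Client: about 10%
    guest = guest_ghz_claimed(0.96)  # Task Manager in the VM: 96%
    print(f"host total in use:   {host:.2f} GHz")
    print(f"guest thinks it has: {guest:.2f} GHz")
```

On those numbers the guest believes it is burning several times more CPU than the whole host is delivering to all of its VMs combined.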

24 Replies
gary1012
Expert

Are you using the native perfmon applet? If so, Windows perfmon isn't a good indicator of the issue, since time is synthesized on a VM. You need to check either esxtop or vmperfmon to see if these CPU spikes are correct. Also, there are a number of good docs available here and on the web. Just search on VI3 performance tuning.

Community Supported, Community Rewarded - Please consider marking questions answered and awarding points to the correct post. It helps us all.
Number774
Contributor

I don't think the tools I have are lying to me; there is too much data that ties together. VMware isn't giving the VM much CPU, but the VM is using all it gets.

A search on VI3 performance tuning gives me 3050 hits. I'll start on the first few, but any hint would be helpful!

Thanks

gary1012
Expert

I'm not doubting that you have an issue; merely suggesting alternate tools to help with troubleshooting. I would start with esxtop and see if you've any CPU Ready issues. As for narrowing down the search list, try this site under troubleshooting and performance: http://vmware-land.com/Top_10_Lists.html.

Ken_Cline
Champion

>>A search on VI3 performance tuning gives me 3050 hits. I'll start on the first few, but any hint would be helpful!

A popular topic, indeed! Since the document listing in this forum is a little "under the weather" right now, here's a list of documents (in no particular order) that should help you out:

Storage Performance Analysis and Monitoring

Understanding Performance

Ready Time

esxtop Performance Counters

VirtualCenter Performance Counters

Memory Performance Analysis and Monitoring

Storage System Performance Analysis with Iometer

Storage Queues and Performance

Network Performance Analysis and Monitoring

VMkernel Scheduler

Guest-based Performance Measurement

Understanding VirtualCenter Performance Statistics

Performance Monitoring and Analysis

Using Perfmon for esxtop-based Performance Analysis

Benchmarking

Time-based Measurements in Virtual Machines

Hyper-Threading on ESX Server

Co-scheduling SMP VMs in VMware ESX Server

Best Practices for Performance

Best Practices for IIS

Best Practices for Apache

Best Practices for Web Servers

CPU Performance Analysis and Monitoring

And, of course, as was already pointed out, the BEST place to start any VMware-related search is Eric's http://www.vmware-land.com

Ken Cline

Technical Director, Virtualization

Wells Landers

TVAR Solutions, A Wells Landers Group Company

VMware Communities User Moderator

Ken Cline VMware vExpert 2009 VMware Communities User Moderator Blogging at: http://KensVirtualReality.wordpress.com/
Number774
Contributor

I feel my hair coming loose. That site has a collection of interesting-sounding links, but all the ones I clicked on are broken. It looks as though VMworld.com has changed its link structure since he wrote the page. Or is it just me that finds http://vmworld.com/vmworld/mylearn?classID=11026 doesn't go anywhere useful?

gary1012
Expert

Unfortunately, most of the VMworld links are only useful if you have been to VMworld.

A couple of questions...

1. Do you really require 4 vCPUs? vSMP scheduling has been known to be an issue when the application isn't multi-threaded or if you're not using the full complement of processors.

2. Do you have any shares/reservations that would compete/limit processor time for this or other VMs?

3. What values are you showing for CPU % RDY for the host and VM?

Number774
Contributor

>>Do you really required 4 vCPUs?

Yes. This is a build machine, compiling under Visual Studio. We have moderately complicated source; it takes the best part of an hour on a single CPU. On a dual CPU machine (my development box) it uses the 2nd CPU for parts of the build; this knocks back the build time by about 50%. More CPUs ought to help even more - up to a point.

>>Do you have any shares/reservations that would compete/limit processor time for this or other VMs?

Not on CPU. One other VM has a couple of GB of RAM reserved; this one thinks it has 4 GB. I suspect that I may be out of RAM, though.

>>What values are you showing for CPU % RDY for the host and VM?

Here's a snapshot:

4:50:08pm up 23:10, 98 worlds; CPU load average: 0.30, 0.30, 0.22

PCPU(%): 2.52, 2.69, 2.20, 2.53, 26.58, 42.21, 9.11, 28.42 ; used total: 14.53

CCPU(%): 1 us, 0 sy, 99 id, 0 wa ; cs/sec: 315

ID GID NAME NWLD %USED %RUN %SYS %WAIT %RDY %IDLE %OVRLP %CSTP %MLMTD SWTCH/s MIG/s PMIG/s WMIGI/s CMIG/s QEXP/s WAKE/s AMIN AMAX ASHRS

1 1 idle 8 684.94 685.97 0.00 0.00 114.97 0.00 0.00 0.00 0.00 6.97 0.00 0.00 0.00 0.00 0.22 0.00 0 -1 0

2 2 system 6 0.05 0.05 0.00 600.00 0.00 0.00 0.00 0.00 0.00 2.25 0.00 0.00 0.00 0.00 0.00 2.25 10 -1 500

6 6 helper 22 0.01 0.01 0.00 2200.00 0.01 0.00 0.00 0.00 0.00 9.44 0.67 0.67 0.00 0.00 0.00 9.44 3 -1 1000

7 7 drivers 11 0.01 0.01 0.00 1100.00 0.00 0.00 0.00 0.00 0.00 3.37 0.00 0.00 0.00 0.00 0.00 3.37 3 -1 1000

9 9 console 1 1.57 1.57 0.01 98.16 0.38 98.15 0.06 0.00 0.00 107.46 0.00 0.00 0.00 0.00 0.00 81.61 10 100 1000

16 16 vmware-vmkauthd 1 0.00 0.00 0.00 100.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0 -1 -3

1133 25 vmware-vmx 1 0.08 0.08 0.00 100.00 0.01 0.00 0.01 0.00 0.00 13.71 0.22 0.22 0.00 0.00 19.11 12.81 0 -1 1000

1134 25 vmm0:mh-rg-buil 1 25.73 25.72 0.03 73.96 0.44 73.90 0.04 0.00 0.00 75.98 3.82 3.82 0.00 0.00 9.22 62.05 0 -1 4000

1135 25 vmm1:mh-rg-buil 1 31.55 30.86 0.00 68.88 0.38 68.81 0.05 0.00 0.00 95.99 14.61 14.61 0.00 0.00 4.50 67.89 0 -1 4000

1136 25 vmm2:mh-rg-buil 1 22.77 22.79 0.00 77.03 0.30 77.01 0.05 0.00 0.00 100.49 19.11 19.11 0.00 0.00 2.70 78.91 0 -1 4000

1137 25 vmm3:mh-rg-buil 1 24.53 24.54 0.00 75.16 0.42 75.16 0.04 0.00 0.00 98.02 20.01 20.01 0.00 0.00 0.90 73.96 0 -1 4000

1138 25 vmware-vmx 1 0.00 0.00 0.00 100.00 0.00 0.00 0.00 0.00 0.00 0.45 0.22 0.22 0.00 0.00 0.00 0.45 0 -1 1000

1139 25 mks:mh-rg-build 1 0.48 0.47 0.02 99.40 0.25 0.00 0.03 0.00 0.00 84.30 4.05 4.05 0.00 0.00 0.00 77.33 0 -1 1000

1140 25 vcpu-0:mh-rg-bu 1 0.02 0.02 0.00 100.00 0.00 0.00 0.00 0.00 0.00 11.02 2.47 2.47 0.00 0.00 10.79 10.79 0 -1 1000

1141 25 vcpu-1:mh-rg-bu 1 0.02 0.02 0.00 100.00 0.00 0.00 0.00 0.00 0.00 5.62 0.90 0.90 0.00 0.00 4.50 5.62 0 -1 1000

1142 25 vcpu-2:mh-rg-bu 1 0.00 0.00 0.00 100.00 0.00 0.00 0.00 0.00 0.00 1.35 0.45 0.45 0.00 0.00 0.90 1.35 0 -1 1000

1143 25 vcpu-3:mh-rg-bu 1 0.00 0.00 0.00 100.00 0.00 0.00 0.00 0.00 0.00 2.02 0.45 0.45 0.00 0.00 1.12 1.80 0 -1 1000

If the system is running out of RAM and is swapping the VM in question, will it show this symptom?
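
To make the snapshot easier to read, the following minimal Python sketch adds up the figures for the build VM. The numbers are copied by hand from the esxtop output above (the PCPU line and the %USED and %RDY columns of the four vmm worlds); `summarize` is a hypothetical helper, not part of any VMware tool.

```python
# Per-core utilisation from the PCPU(%) line of the snapshot.
PCPU = [2.52, 2.69, 2.20, 2.53, 26.58, 42.21, 9.11, 28.42]

# (world name, %USED, %RDY) for the four vmm worlds of the build VM.
VMM_WORLDS = [
    ("vmm0", 25.73, 0.44),
    ("vmm1", 31.55, 0.38),
    ("vmm2", 22.77, 0.30),
    ("vmm3", 24.53, 0.42),
]


def summarize(pcpu, worlds):
    """Return (host average PCPU %, total vCPU %USED, total vCPU %RDY)."""
    host_avg = sum(pcpu) / len(pcpu)
    used = sum(world[1] for world in worlds)
    ready = sum(world[2] for world in worlds)
    return host_avg, used, ready


if __name__ == "__main__":
    host_avg, used, ready = summarize(PCPU, VMM_WORLDS)
    print(f"host average PCPU:   {host_avg:.2f}%")
    print(f"VM vCPU %USED total: {used:.2f}% of a possible {100 * len(VMM_WORLDS)}%")
    print(f"VM vCPU %RDY total:  {ready:.2f}%")
```

The host average comes out at 14.53%, matching the "used total" in the snapshot. The four vCPUs together account for about 104.58% (roughly one core's worth out of a possible 400%) with a combined %RDY of about 1.54%, which suggests the vCPUs are not spending much time queued for a physical CPU in this sample.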

gary1012
Expert

Swapping could be an issue; however, more investigation will be required on your part. Of the publicly available docs on Ken's list, the best one to build upon might be the Performance Monitoring and Analysis doc located here: http://communities.vmware.com/docs/DOC-3930.

jasonboche
Immortal

To the OP:

You are suffering from a known issue in VirtualCenter 2.5.0. In a cluster with DRS enabled, the CPU usage of a VM goes through the roof after a VMotion. This is resolved as of VirtualCenter 2.5.0 Update 1.

For more technical details on the "why", see VMware KB article 1003638. I went through this with our VI several months ago.

http://kb.vmware.com/kb/1003638






Jason Boche

VMware Communities User Moderator (http://communities.vmware.com/docs/DOC-2444)

Message was edited by: Ken.Cline to shorten the URL (Shame on you Jason!)

VCDX3 #34, VCDX4, VCDX5, VCAP4-DCA #14, VCAP4-DCD #35, VCAP5-DCD, VCPx4, vEXPERTx4, MCSEx3, MCSAx2, MCP, CCAx2, A+
esiebert7625
Immortal

Actually that's not the case. I only post links for content from VMworld that has been released for non-attendees. You simply create a free VMworld account and you can see the content.

Eric Siebert

VMware Communities User Moderator

-=-=-=-=-=-=-=-=-=-=-==-=-=-=-=-=-=-=-=-=-=-=-

Check out my website: VMware-land

Read my virtualization blog: SSV Blog

-=-=-=-=-=-=-=-=-=-=-==-=-=-=-=-=-=-=-=-=-=-=-

Number774
Contributor

This sounds almost like the answer.

The memory allocated to the VM does crawl upwards when I fire off my build. So I went in and reconfigured the VMOverheadGrowthLimit parameter to 4, and almost immediately, according to the vmkernel log, vpxuser set it back. The only thing is, I don't have DRS enabled on this box! If I get the IT guys to update VirtualCenter to Update 1, will this enable me to change the parameter, or will it change it itself, or is there a better way?

Thanks

Number774
Contributor

You're right, it isn't the case. After your post I did a little more poking around, and in fact the problem is that the links don't work under Firefox, only under IE!

esiebert7625
Immortal

I have no problem with either browser. The VMworld presentations all use Macromedia Flash, so if you do not have it installed and configured for Firefox they may not work.

jasonboche
Immortal

There are two fixes/workarounds. One involves making a configuration change on each ESX host. The other involves VirtualCenter and fixes the problem globally. I never tried the ESX host fix but the VirtualCenter fix worked like a charm and resolved the high CPU issues.

Jas






Number774
Contributor

I have Flash 9.0 r124 installed. I see the same behaviour on my machine at home too - it stops at the login screen, and after a login has lost any idea of where it might be going. I can live with it though...

(Update on the original problem: I'm going to sit down with the IT guys later and fiddle around with cluster settings. Unfortunately no-one still here was involved in cluster setup, and we don't even know why my machine is linked into the main IT VMware setup, so we are a little cautious about fiddling. We don't want to risk taking the corporate domain controllers down!)

Number774
Contributor

Hi Guys,

Sorry for the slow update. I've finally managed to catch the IT guy, and I now have control of my machine!

We went to the root of the infrastructure tree and made a new cluster, because one of the workarounds says you can set this at the cluster level in the DRS tab. But when we tried to drag the server into the new cluster we got an error telling us we didn't have a licence. So instead we disconnected my machine from the tree. Now when I set the VMOverheadGrowthLimit value it sticks. All I need to do now is run some work through it.

Thanks

Number774
Contributor

Hi Guys,

An update on this for you. I gave up and went and bought some more RAM for the box. This took it from 8 GB to 20 GB, and it really took off. It's been running fine for the last month.

But...

As you know, Parkinson's law states that (anything you choose) expands to fill the space available. In this case, it's the workload that has expanded. I've just fired up a couple more machines, and run out of RAM again. Guess what - the problem is back. It looks as though the ballooning process eats processor cycles: a VM that is idle, but is being ballooned, can end up using several gigahertz of CPU power. Surely this isn't normal behaviour?

Ken_Cline
Champion

>>a VM that is idle, but is being ballooned, can end up using several gigahertz of CPU power. Surely this isn't normal behaviour?

No, not normal behavior. All the balloon driver does is allocate RAM from the guest OS (essentially does a malloc). It then releases the allocated RAM back to the hypervisor to satisfy other memory pressures. The only impact this should have on performance is that (potentially) the guest OS may need to use its own paging file to satisfy internal RAM requirements.

Number774
Contributor

I'm going to try an experiment. I've got 20 GB of RAM, and I'm going to set it up for tonight as follows:

4x VMs with XP, 1 GB each, reserved. These machines do a lot of hard work, but with any luck, not while the builds are running.

1x VM with Server 2003, 3 GB, reserved. This is a file server for the 4 above but otherwise not very busy (but they do do a lot of I/O!)

1x Server 2003, 2 GB, reserved. Probably pretty idle.

1x Server 2003, 512 MB, reserved, definitely idle.

The 2 build machines, Server 2003, 8 GB each, none reserved.

Note the server VMs all have 4 vCPUs; the system has 8 physical ones @ 2 GHz.

That's 9.5 GB reserved, plus a bit for ESX; the two others would take it to ~26 GB, but they can't have it - obviously.
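
The sum above can be checked with a short Python sketch. The VM labels are made up for illustration, the 8 GB figure is assumed to be per build machine (which is what the ~26 total implies), and ESX's own overhead is ignored.

```python
HOST_RAM_GB = 20.0

# (label, number of VMs, GB per VM, reserved?) - labels are illustrative only.
VMS = [
    ("XP workers", 4, 1.0, True),
    ("2003 file server", 1, 3.0, True),
    ("2003 probably idle", 1, 2.0, True),
    ("2003 definitely idle", 1, 0.5, True),
    ("2003 build machines", 2, 8.0, False),  # assumes 8 GB each
]


def memory_totals(vms):
    """Return (reserved GB, unreserved GB) across all VMs."""
    reserved = sum(n * gb for _, n, gb, is_reserved in vms if is_reserved)
    unreserved = sum(n * gb for _, n, gb, is_reserved in vms if not is_reserved)
    return reserved, unreserved


if __name__ == "__main__":
    reserved, unreserved = memory_totals(VMS)
    configured = reserved + unreserved
    left_over = HOST_RAM_GB - reserved
    print(f"reserved:   {reserved:.1f} GB")
    print(f"configured: {configured:.1f} GB")
    print(f"left for the build machines: {left_over:.1f} GB of {unreserved:.1f} GB wanted")
```

This gives 9.5 GB reserved and 25.5 GB configured, so at most about 10.5 GB is left for the two build machines, which together are configured for 16 GB.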

The two build machines will fire up a build, one at 20:00 GMT, the other at 21:00. The two builds don't normally overlap - they take about 40 minutes. So I'd expect the 20:00 one to be idle by the time the 21:00 one cuts in. (When they are done, the first 4 VMs plus a load of other physical and virtual machines will test the builds. That takes hours.)

Now as I write, the two build machines are sitting idle. I'll report back in the morning.
