Doesn't seem to be any contention. Any ideas why it seems to be limited to just 50% of the available CPU?
Now you see one of the MANY reasons why multi-CPU VMs are a WASTE! This is what we try to illustrate, thanks for proving our point!
It is the APPS that drive CPU usage. If the apps are not multi-threaded or SMP-aware, you can't take advantage of extra CPUs unless the apps are SPECIFICALLY written for it.
A VM does a great job of demonstrating how inept apps really are....
Yeah, you will find that whatever app you're running on that box can only take advantage of 2 threads, which is 50% of the CPU resources you have allocated.
In addition to this (and no, I was not smoking anything at the time), I have had a VM that did not appear to be able to consume more than 50% of the CPU. I had to cold power the VM off and then on again to get it back to normal.
It would seem that your app has only 2 busy threads, hence the hard 50% limit. Be careful with 4-vCPU guests, as they can cause CPU scheduling delays on many hosts. In general the recommendation is to allocate at most half the number of physical processor cores to any one guest, but even then, two such guests would leave no free CPU resource for underlying ESX functions (iSCSI, NFS, memory management, etc.).
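As a back-of-the-envelope sketch (the numbers here are illustrative, not measurements from this thread): a compute-bound app that can only keep N threads runnable at once tops out at N divided by the vCPU count of the guest's total CPU.

```python
def max_cpu_utilization(busy_threads, vcpus):
    """Upper bound on the fraction of total guest CPU a compute-bound
    app can reach when it can only keep `busy_threads` threads
    runnable at once."""
    return min(busy_threads, vcpus) / vcpus

# 2 busy threads on a 4-vCPU guest -> at most 50% overall CPU
print(max_cpu_utilization(2, 4))  # 0.5
```

This matches the symptom exactly: 2 runnable threads on 4 vCPUs shows up as an overall cap of 50%, no matter how much headroom the host has.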
Please award points to any useful answer.
Thanks everyone who has responded so far. I really appreciate the help.
When we watch Task Manager we can see 4 different processes at the top of the list, together using only about 50% of the CPU. With 4 processes, we would expect it to fully utilize all 4 processors, near 100% usage. This all came about as a result of my customer not being pleased with his VM performance. He then installed the application on an older physical server, and it completes this simulation in less than half the time the VM takes. For testing purposes we moved the VM to a standalone ESX host to be certain we weren't seeing contention with other VMs for CPU resources. Is it valid to think that because we see four processes related to this application concurrently running in Task Manager, we would see all 4 vCPUs in action and not be limited to 50% usage by VMware or Windows? It is strange to me to see all 4 vCPUs at or about 50% used for the duration of the test...
The ESX host is a Dell PowerEdge R710: dual-socket quad-core processors with hyperthreading and VT enabled, Intel E5540 @ 2.53 GHz.
Have you tried giving it a higher cpu share or reservation?
What do the CPU ready values show?
Check for ballooning memory too.
You may want to consider turning hyperthreading off; VMware may be putting your vCPUs on the same core and hyperthreading them, effectively cutting your allocation in half.
CPU affinity as a test may help.
I would like to hear your results.
However, over-allocation of CPU resources is a problem when you get into multiple VMs. The more vCPUs a VM has, the harder it is to schedule on a busy host, especially if the vCPU allocation is even close to the number of physical cores.
I have app folks who always wanted 4-8 vCPUs on an 8-core host with 10 VMs. Always drives me crazy. When we got the 24-core blades, CPU ready plummeted.
At the moment this is the only VM running on that ESX host, so I would think shares would not come into play. I am not using reservations, and the resource limit is set to unlimited.
%RDY counters are very low to none
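For anyone comparing vCenter's "CPU Ready" chart against esxtop's %RDY: vCenter real-time charts report ready time as a summation in milliseconds over a 20-second sample, so it has to be converted to a percentage before comparing. A small sketch of the usual conversion (the 20-second interval applies to the real-time chart; historical rollups use longer intervals):

```python
def cpu_ready_percent(ready_ms, sample_interval_s=20):
    """Convert a vCenter 'CPU Ready' summation value (milliseconds
    accumulated in one sample) into a percentage, roughly comparable
    to esxtop's %RDY. vCenter real-time charts use 20 s samples."""
    return ready_ms / (sample_interval_s * 1000) * 100

# 1000 ms of ready time within a 20 s real-time sample -> 5% ready
print(cpu_ready_percent(1000))  # 5.0
```

With %RDY effectively at zero here, the scheduler is giving the VM every cycle it asks for, which points the investigation inside the guest.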
I thought about turning hyperthreading off, but in the end we would like to get the most out of this ESX server. It would also mean we would need to turn it off on every host in the cluster in case the VM was migrated (although it may be a good exercise for a data point). Is there a way to tell how much time a vCPU actually spends running as a hyperthread? Might be worth a test just to verify. The weird thing is that the VM seems unable to push the CPU, whether logical or physical, beyond 50%, and in fact it is almost exactly 50%.
I am not sure what CPU affinity settings to make. I would hope the scheduler is already trying to use all cores across multiple sockets (there are only two sockets).
Yeah, I hate going with SMP VMs; it adds complexity and reduces my consolidation ratios.
I suggested affinity so you would have an alternative to shutting down and disabling hyperthreading as a test. You can set affinity to spread the vCPUs across distinct physical cores as evenly as possible, forcing the load to spread out, and see what happens.
As for the customer putting their app on another box where it runs fine... that's a different install. I wonder if there are any app settings getting in the way.
This is a real-world example I had a few years ago. An Oracle server was running 2 copies of a database (different data for 2 different business units, but the same app twice). One DB was running slow, the other was fine. The vendor darn near flat-out refused to support the app on a VM, and kept insisting that we run physical. Since management was not going to purchase hardware, I spent a long time playing with resources. While I was able to make some minor improvements with settings such as hyperthreading off, isolating it on a host, etc., nothing ever changed dramatically. I had to press back on the app guys as to why one DB seemed OK and the other did not... After nearly a YEAR, the DBA and vendor started going through the DB structure and found that a table had been added to the badly running DB that was not in use. After fixing that, it ran fine. Not exactly your problem, but it shows that application and OS configuration can make a difference.
Also, if your CPU ready is very low, that is good: the VM is getting all the CPU it is asking for. If the VM is only asking for that much, then I would look inside the VM itself.
We pull out perfmon a lot in VMs. You might have to dig a little deeper into the guest itself.
The Oracle example is very sadly so very common in the Oracle application vendor space. Horrid work ethics and mindset, and idiotic pricing with it.
When monitoring the guest OS via perfmon, we see a large proportion of CPU time spent in privileged mode. It is actually fairly high on the physical box as well, but much higher on the VM: roughly 80% of CPU time is spent in privileged mode on the VM, versus around 50% on the physical server. There is very little paging occurring. I am having trouble understanding why so much time is being spent in non-user mode. It is possible that a lot of floating-point calculations are being done. Does a VM get direct access to the floating-point unit, or does the hypervisor trap it and do software translation? If so, does that count toward privileged mode or user mode?
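The user/privileged split can also be measured per-workload rather than system-wide. A minimal sketch of the idea using Python's `os.times()` (the `syscall_heavy` workload is a made-up illustration, not this thread's app; on Windows, perfmon's "% Privileged Time" counter gives the equivalent system-wide view):

```python
import os

def system_time_fraction(workload):
    """Run `workload` and return the fraction of this process's CPU
    time spent in kernel (privileged) mode rather than user mode."""
    t0 = os.times()
    workload()
    t1 = os.times()
    user = t1.user - t0.user
    system = t1.system - t0.system
    total = user + system
    return system / total if total else 0.0

def syscall_heavy():
    # Each os.write() is a system call, so kernel time accumulates.
    fd = os.open(os.devnull, os.O_WRONLY)
    for _ in range(50000):
        os.write(fd, b"x")
    os.close(fd)

print(f"privileged-mode fraction: {system_time_fraction(syscall_heavy):.0%}")
```

If the heavy privileged time tracks a particular process like this, it usually points at system calls (often I/O) rather than raw computation; pure floating-point math executes as ordinary user-mode instructions.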
How fast is your storage? Your performance issue may not be CPU at all.
When we first started with VMware, most VMs were not disk-I/O intensive; then we tried to virtualize one server that was, and it ran like a dog. We ended up doing massive I/O benchmarks and several SAN upgrades to get performance near what physical hardware can give.
If most of the CPU time is in privileged mode, it could be waiting on disk I/O (not paging, but file access). I saw this kind of thing with a database once; it turned out processes were waiting for the I/O to complete before moving on.
Disk queue length, disk reads/s, disk writes/s, % disk time: any of these can really slow you down.
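Those counters are related by Little's law: the average number of outstanding I/Os equals the I/O arrival rate times the average latency. A quick sketch (the numbers are made up for illustration):

```python
def avg_disk_queue_length(iops, avg_latency_s):
    """Little's law: average outstanding I/Os = arrival rate x
    average time each I/O spends in the system."""
    return iops * avg_latency_s

# e.g. 500 IOPS at 20 ms average latency sustains a queue depth of 10
print(avg_disk_queue_length(500, 0.020))  # 10.0
```

So a persistently high disk queue length in perfmon means either the app is issuing I/O faster than the storage completes it, or per-I/O latency is high; either way the CPU ends up waiting.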
I had to use a tool called Iometer and spent a bunch of time getting our system cranked up.
From what you've posted before, if CPU is at 50% and ready times are low, it doesn't sound like a CPU problem. Now you've mentioned privileged time, and that sounds like your processor is waiting for something. It took me a while to realize that high CPU times may have nothing to do with executing program instructions, and more to do with running the 'check loop' looking for the information requested from disk and other sources.