Five of our RHEL 5.4 32-bit, 1-vCPU VMs (out of 30 or so) are constantly consuming 90-100% of the underlying physical host's CPU. Inside the guest itself, the VM is idle.
We have the latest VMware Tools installed. We have tried stopping the tools, moving the VMs to another host, restarting the VMs, and checking that / is not full on the guests.
ESXTOP confirms the high CPU use seen in vCenter.
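For anyone wanting to capture this instead of eyeballing the interactive screen: esxtop has a batch mode (-b) that dumps its counters as CSV, which you can then mine for a VM's "% Used". A rough sketch below — the capture line is what you'd run on the ESX host, and the parsing is demonstrated against a small fabricated sample (the hostname, VM names, and group IDs are made up for illustration):

```shell
# On the ESX host (uncomment to capture 12 samples at 5s intervals):
# esxtop -b -d 5 -n 12 > /tmp/esxtop.csv

# Fabricated two-sample extract standing in for real esxtop batch output:
cat > /tmp/esxtop-sample.csv <<'EOF'
"Time","\\esx01\Group Cpu(1234:rhel5-vm1)\% Used","\\esx01\Group Cpu(5678:rhel5-vm2)\% Used"
"10:00:05","97.31","2.14"
"10:00:10","98.02","1.87"
EOF

# Print the "% Used" samples for the suspect VM, picking the column whose
# header mentions the VM's name:
awk -F'","' -v vm="rhel5-vm1" '
NR==1 { for (i = 1; i <= NF; i++) if ($i ~ vm) col = i; next }
      { gsub(/"/, "", $col); print $col }
' /tmp/esxtop-sample.csv
```

Against the sample this prints 97.31 and 98.02 — a host-side view of the same 90-100% the vCenter charts show, even while the guest reports itself idle.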
Before I open a ticket, is there anything I can try?
So after a bit of googling I found this. It seems to have helped on the one VM I have applied it to so far.
http://www.g-loaded.eu/2009/12/18/high-cpu-usage-centos-guest-virtualbox-vmware/
The timekeeping best practices KB (http://kb.vmware.com/kb/1006427) linked to in the blog article is a great help and tends to fix both high CPU usage and weird application behavior and errors on Linux VMs, so your solution is probably dead on. The recommended settings differ between Linux kernels, though, so read it carefully.
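A few quick checks inside the guest before changing anything, loosely following that KB (these are standard Linux paths, but the exact recommended parameters vary by kernel version, so treat this as a checklist rather than gospel — older 2.6.9-era kernels may not expose all of these files):

```shell
echo "Kernel boot parameters:"
cat /proc/cmdline

echo "Current clocksource:"
# Present on kernels with the generic clocksource framework.
if [ -r /sys/devices/system/clocksource/clocksource0/current_clocksource ]; then
    cat /sys/devices/system/clocksource/clocksource0/current_clocksource
else
    echo "(clocksource sysfs entry not available on this kernel)"
fi

echo "Timer tick frequency (CONFIG_HZ), if the kernel config is exposed:"
grep '^CONFIG_HZ=' "/boot/config-$(uname -r)" 2>/dev/null \
    || echo "(kernel config not available)"
```

If /proc/cmdline shows no divider= or clocksource= parameters on an affected RHEL 5 guest, that is the first thing the KB would have you look at.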
If I assume you are running ESX 4 and the ESX hosts are 64-bit multi-core, why are you only assigning 1 vCPU to each VM? I was seeing the same issue until I allocated no fewer than 2 vCPUs per VM. Look at your threads and processes. Assigning multiple CPUs to Linux will solve the problem more effectively than any kernel hack.
All my production VMs, Windows Server or Linux, get no fewer than 4 vCPUs by default.
Brian Nelson
Hang 2 LEDs in the datacenter. The students are coming! The students are coming!
Assigning multiple vCPUs as a default would not fall under best practices. While processor scheduling is much more relaxed than in previous versions, you will lose VM density by overcommitting vCPUs, and you do risk reduced performance across the host.
Quite a few assumptions and blatant generalizations here which I think need to be addressed:
Assigning multiple vCPUs as a default would not fall under best practices
Best practice? Whose? Many best practices are set by the OEM in order to reduce support costs and complexity.
you will loose VM density by over committing vCPUs
Depends on the VMs, depends on how the resource pool limits are set, depends on the workload characteristics of the VMs, depends....
risk reduced performance across the host
Again, it depends on the VMs, how your workloads use the resources, and how you allocate those resources to the VMs.
Brian Nelson
Going way off-topic, but I can't really help myself on this one.
Best practice? Whose?
http://www.vmware.com/pdf/Perf_Best_Practices_vSphere4.0.pdf - page 19
Use as few virtual CPUs (vCPUs) as possible. For example, do not use virtual SMP if your application is single-threaded and will not benefit from the additional vCPUs.
Even if some vCPUs are not used, configuring virtual machines with them still imposes some small resource requirements on ESX:
Unused vCPUs still consume timer interrupts.
Maintaining a consistent memory view among multiple vCPUs consumes resources. Some older guest operating systems execute idle loops on unused vCPUs, thereby consuming resources that might otherwise be available for other uses (other virtual machines, the VMkernel, the console, etc.).
The guest scheduler might migrate a single-threaded workload amongst multiple vCPUs, thereby losing cache locality.
Adding to that, HA slot sizes will also be negatively affected, causing admission-control problems and/or resource waste.
It's pretty clear to me that overcommitting vCPUs IS wasting resources no matter how you arrange them or how your workload is shaped; overcommitting is overcommitting any way you spin it.
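If you want an actual number for that overcommit debate rather than gut feel, one crude way is to total up the "numvcpus" entries in the registered VMs' .vmx files and compare against the host's physical cores. The sketch below uses fabricated .vmx fragments in /tmp so the arithmetic is verifiable; on a real ESX host the .vmx files live under /vmfs/volumes/<datastore>/ and the core count comes from the host's hardware info:

```shell
# Fabricated .vmx fragments standing in for three registered VMs:
mkdir -p /tmp/vmx-demo
printf 'numvcpus = "4"\n' > /tmp/vmx-demo/vm1.vmx
printf 'numvcpus = "2"\n' > /tmp/vmx-demo/vm2.vmx
printf 'displayName = "vm3"\n' > /tmp/vmx-demo/vm3.vmx  # no numvcpus entry

total=0
for vmx in /tmp/vmx-demo/*.vmx; do
    n=$(sed -n 's/^numvcpus *= *"\([0-9]*\)".*/\1/p' "$vmx")
    total=$((total + ${n:-1}))   # a missing numvcpus line means 1 vCPU
done
echo "total vCPUs: $total"

pcores=8   # stand-in value; read this off the host's hardware summary
echo "overcommit ratio: $total vCPU : $pcores cores"
```

Against the sample this reports 7 vCPUs (4 + 2 + 1) across three VMs. A ratio well above the core count is where the density and scheduling arguments above start to bite.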
Just out of curiosity: if you power off the VM and power it back on, does that resolve the problem?
You can also try divider=10 as a kernel parameter in the RHEL guest.
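For anyone unsure where that goes: divider=10 drops the RHEL 5 timer tick from 1000 Hz to 100 Hz per vCPU, which is usually why it calms an idle guest's host-side CPU use. A hedged sketch of the edit, run against a fabricated copy of grub.conf so the sed is verifiable — on the real guest you would edit /boot/grub/grub.conf (with a backup) and reboot:

```shell
# Fabricated grub.conf standing in for the real /boot/grub/grub.conf:
cat > /tmp/grub.conf <<'EOF'
default=0
timeout=5
title Red Hat Enterprise Linux Server (2.6.18-164.el5)
        root (hd0,0)
        kernel /vmlinuz-2.6.18-164.el5 ro root=/dev/VolGroup00/LogVol00
        initrd /initrd-2.6.18-164.el5.img
EOF

cp /tmp/grub.conf /tmp/grub.conf.bak                   # always keep a backup
sed -i '/^[[:space:]]*kernel /s/$/ divider=10/' /tmp/grub.conf  # append to kernel line(s)

grep 'kernel ' /tmp/grub.conf
# After rebooting the guest, confirm it took effect with: cat /proc/cmdline
```

The grep shows divider=10 appended to the kernel line. Check the timekeeping KB first, though — the recommended parameter set differs between kernel versions and 32-bit vs 64-bit guests.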