VMware Cloud Community
ehayes
Contributor

Forecasting CPU Loads

This is part two of a two-part post. I figured I'd put my CPU-related questions in this thread to solicit feedback on my thoughts.

I'm using the API to pull certain data from VirtualCenter for capacity planning purposes. My goal is to forecast when to add hardware resources to this environment (add physical memory, upgrade processors, or simply scale out servers).

Metric collected at the cluster level:

cpu.usagemhz.average: this shows me the aggregate CPU usage, in MHz, of all running virtual machines.

Data examples from one of my environments:

VM breakdown = 175 VMs / 8 hosts = ~22 VMs per host (not quite that evenly balanced, but you get the point)

Of the 175, some are multi-vCPU (vSMP) machines:

2 VMs with 4 vCPUs

32 VMs with 2 vCPUs

141 VMs with 1 vCPU

8 hosts x 8 cores x 2.6 GHz = 166400 MHz of processor power

cpu.usagemhz.average = 56114 MHz / 43%

Sounds easy, at least to me: maybe set a threshold at 70-80%, and add physical resources once usage crosses it.

Well, not quite. The 43% figure was at 30% before we added ~15 new VMs (RHEL 5 64-bit, all 2-vCPU machines). With the new VMs, usagemhz is up to 43%. I still think we should have room.
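The arithmetic behind that "room" can be sketched as below. This is a rough illustration only: it assumes usage grows linearly with the number of similar VMs, which ignores scheduling effects entirely, and the helper names `threshold_mhz` and `vms_until_threshold` are made up for this example.

```python
# Figures quoted in the post.
HOSTS = 8
CORES_PER_HOST = 8
CORE_MHZ = 2600  # 2.6 GHz per core

raw_capacity_mhz = HOSTS * CORES_PER_HOST * CORE_MHZ  # 166400 MHz


def threshold_mhz(capacity_mhz, threshold_pct):
    """MHz of usage at which a percentage threshold is crossed (integer math)."""
    return capacity_mhz * threshold_pct // 100


def vms_until_threshold(current_pct, threshold_pct, pct_per_vm):
    """Whole VMs that fit before the threshold, under a naive linear model."""
    headroom_pct = threshold_pct - current_pct
    if headroom_pct <= 0:
        return 0
    return int(headroom_pct // pct_per_vm)


# About 15 new 2-vCPU VMs moved usage from 30% to 43% (approximate, per the post).
pct_per_vm = (43 - 30) / 15

for threshold_pct in (70, 80):
    print(
        f"{threshold_pct}% threshold = {threshold_mhz(raw_capacity_mhz, threshold_pct)} MHz, "
        f"room for ~{vms_until_threshold(43, threshold_pct, pct_per_vm)} more similar VMs"
    )
```

The percentages are taken as quoted rather than recomputed: 56114 MHz against the raw 166400 MHz works out to roughly 34%, so the 43% reading may be based on a different denominator or sample.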

In the back of my mind I'm thinking I wish my hosts had more cores, to allow better scheduling of all these vSMP machines. The sysadmins of those VMs are finally getting ready to actually use them, and while they are extracting tar.gz archives I get a call saying their VMs are starting to hang and lose responsiveness. All the extractions complete and the VMs move along, but they are very slow.

So what do I do? I look at VirtualCenter (Real-Time view) and see that the VMs in question have CPU Ready times of 3000-5000 ms, which is really high.

Next, since I have Level 3 stats enabled, I switch to the Past Day view and see:

CPU Ready Latest = 4081 ms

CPU Ready Maximum = 27393 ms (wow, is that 27 seconds?!)

CPU Ready Minimum = 3172 ms

CPU Ready Average = 8075 ms
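CPU Ready is reported as a summation in milliseconds over each sample interval, so the raw numbers are easier to compare as a percentage of that interval. The sketch below assumes the default chart intervals (20 seconds for Real-Time, 300 seconds for Past Day). The optional per-vCPU division assumes the value is a VM-level total across its vCPUs. The names `CHART_INTERVAL_SECONDS` and `ready_percent` are illustrative, not part of any VMware API.

```python
# Assumed default chart sample intervals, in seconds.
CHART_INTERVAL_SECONDS = {"realtime": 20, "past_day": 300}


def ready_percent(ready_ms, interval_s, vcpus=1):
    """Convert a CPU Ready summation (ms) into a percentage of the sample interval.

    Pass vcpus > 1 to express a VM-level total as a per-vCPU percentage.
    """
    return ready_ms / (interval_s * 1000 * vcpus) * 100


# Real-Time readings from the post: 3000-5000 ms per 20 s sample.
for ready_ms in (3000, 5000):
    pct = ready_percent(ready_ms, CHART_INTERVAL_SECONDS["realtime"])
    print(f"realtime {ready_ms} ms -> {pct:.1f}% ready")

# Past Day readings from the post, each over a 300 s sample.
for label, ready_ms in (("latest", 4081), ("maximum", 27393),
                        ("minimum", 3172), ("average", 8075)):
    pct = ready_percent(ready_ms, CHART_INTERVAL_SECONDS["past_day"])
    print(f"past day {label} {ready_ms} ms -> {pct:.1f}% ready")
```

Under these assumptions the 27393 ms maximum is about 27 seconds of waiting accumulated within a 5-minute sample, not a single 27-second stall. The Real-Time readings are the more alarming ones, since 3000-5000 ms of a 20-second sample is a much larger share.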

Back to the problem at hand: how to forecast when to add resources. I can't simply look at usagemhz because of the scheduling issues. How would you plan CPU loads?

admin
Immortal

Average CPU usage alone doesn't give you a complete picture of CPU contention. This is partly because the average doesn't necessarily reflect the peak usage. If busy periods are averaged together with periods when most VMs are idle, the average usage looks lower.

Please consider max usage as well. If the max usage reaches peak capacity, you may want to take some action.
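A small sketch of why the average can hide a peak. The sample values are invented for illustration, and the capacity is one host from the original post (8 cores x 2600 MHz); `summarize_usage` is a hypothetical helper, not a vCenter call.

```python
def summarize_usage(samples_mhz, capacity_mhz):
    """Return average and peak usage as percentages of capacity."""
    average_mhz = sum(samples_mhz) / len(samples_mhz)
    peak_mhz = max(samples_mhz)
    return {
        "avg_pct": average_mhz / capacity_mhz * 100,
        "peak_pct": peak_mhz / capacity_mhz * 100,
    }


host_capacity_mhz = 8 * 2600  # one host: 8 cores at 2.6 GHz

# Hypothetical usage samples: mostly idle, with one burst near saturation.
samples_mhz = [4000, 4200, 3900, 20500, 4100, 4300]

summary = summarize_usage(samples_mhz, host_capacity_mhz)
print(f"average {summary['avg_pct']:.0f}%, peak {summary['peak_pct']:.0f}%")
```

With these made-up samples the average sits around a third of capacity while the peak is close to saturation, which is the case the maximum is meant to catch.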

Ready time is also a good indicator. I've seen people say that ready time of 20% or higher may indicate trouble, but it really depends on your workload and environment.

Based on what you described, the ready time is really high. I'd check whether the host CPU was completely saturated when that happened.

Are you setting CPU affinities? Are you setting a CPU limit on the VM or resource pool?

Hope this helps.

Thanks,

-sb
