VMware Cloud Community
Corvid
Contributor

Core parking and CPU contention

My question is this: does a parked vCPU core still need to be scheduled on the host CPU?

I have Googled around a bit and I have certainly found quite a few differing opinions on whether or not it is better to disable core parking or leave it enabled, but I haven’t found an answer to this basic question.

I recently noticed that one of our Server 2012 VMs had several of its cores parked. The processor usage on this VM is variable. There are often times when the machine is not heavily used, but when it is needed, it needs to be able to respond quickly and it needs all of its cores to do so. At times processor usage can go back and forth between 2% and over 90% quickly. During these times I notice the vCPUs parking and un-parking quite a bit, and I know that kind of activity can have an adverse effect on performance.

My first thought was to simply shut off core parking in Windows to avoid performance problems at these times, especially since there is no power-saving benefit for the VM. But then it occurred to me that if a parked core doesn't need any resources from the host machine, leaving parking on could help with CPU contention on the host when the VM is idle. If, on the other hand, a parked vCPU still needs to be scheduled on the host CPU, then there would be no advantage to leaving parking enabled and I should just set the power scheme on the VM to "High Performance" to shut off core parking.

So, does anyone know whether or not a vCPU core on a guest VM that is in a parked state still needs to have idle cycles scheduled on the host CPU?

3 Replies
jayhvmw
Contributor

The CPU scheduler doesn't know what the OS is doing.

If the VM has 10 vCPUs, the host has to schedule 10 vCPUs, and that means that much more scheduling latency.

Always remove cores you don't need, and turn off CPU and memory hot add.

This lets the scheduler respect NUMA boundaries and try to stay on real cores of a single processor (real cores on a single processor are the fastest).

If you have 2 processors, each with 4 real cores plus 4 hyperthreads, and 64 GB of RAM, then for a large VM 1 socket x 4 cores with <32 GB is the sweet spot for both power and speed.

Next closest, favoring power, is 2 x 4 with <32 GB. Next closest, favoring speed, is 1 x 1 with <32 GB (for single-threaded workloads).
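The sizing rule above boils down to a quick calculation. Here is a minimal sketch of it in Python, using the example topology from this reply (2 sockets, 4 real cores each, 64 GB); the function name and layout are just illustrative:

```python
# Assumed example topology from the reply above, not queried from any host.
SOCKETS = 2
CORES_PER_SOCKET = 4        # real cores, excluding hyperthreads
HOST_MEM_GB = 64

MEM_PER_NODE_GB = HOST_MEM_GB // SOCKETS   # memory local to one socket: 32

def fits_one_node(vcpus: int, mem_gb: int) -> bool:
    """True if the VM can run entirely on real cores of a single socket,
    with all of its memory local to that socket's NUMA node."""
    return vcpus <= CORES_PER_SOCKET and mem_gb < MEM_PER_NODE_GB

# The "sweet spot" from the post: 1 socket x 4 cores, under 32 GB.
print(fits_one_node(4, 31))   # True
print(fits_one_node(8, 31))   # False: needs both sockets or hyperthreads
print(fits_one_node(4, 40))   # False: memory spills to the remote node
```

The point of the check is simply that both conditions must hold at once: fitting the cores but overflowing the local memory (or vice versa) still forces the scheduler across the NUMA boundary.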

rampeter
Enthusiast

What is CPU parking? This is the first time I'm hearing of this concept.

jayhvmw
Contributor

The OS can control individual cores using C-states.

If a core isn't being used, its cache is flushed and it is dropped to "0" MHz to reduce power consumption.

On bare metal that is great if you want to save power. Not as great for performance but not bad.

On virtual machines it is not good, because the VM doesn't own the core, so there is jitter as it comes out of sleep for the next VM.

One example: parking is used as one of the ways to protect against the L1TF vulnerability, by parking the hyperthreads so they aren't used. VMware does this for ESXi in its new scheduler.

Another gotcha is that the extra cores create scheduler latency but aren't being used, which also slows you down. Further, if you use more memory than is in that NUMA node, you face more slowdown.

For performance and power: turn off memory and CPU hot add (either of these defeats NUMA and some scheduler optimizations), stay on real cores with a single socket when possible, and assign less memory than is attached to that CPU to stay inside the NUMA boundaries.

So for 2 sockets with 4 cores and 4 threads each and 64 GB, the best performance and power is 1 socket, 4 cores (so the scheduler can keep it all on the same CPU without having to use threads) with <32 GB so NUMA boundaries can be respected.
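The checklist in this reply can be sketched as a small function that flags NUMA-unfriendly settings. This is purely illustrative; the function name, messages, and default thresholds are assumptions (the defaults match the 2 x 4 / 64 GB example host), not anything ESXi or PowerCLI actually exposes:

```python
# Hypothetical checklist for the guidance above; names and thresholds are
# illustrative and do not correspond to a real ESXi or PowerCLI API.
def numa_warnings(vcpus, mem_gb, hot_add_enabled,
                  cores_per_socket=4, node_mem_gb=32):
    warnings = []
    if hot_add_enabled:
        warnings.append("hot add defeats NUMA and scheduler optimizations")
    if vcpus > cores_per_socket:
        warnings.append("vCPUs exceed the real cores on one socket")
    if mem_gb >= node_mem_gb:
        warnings.append("memory spans NUMA nodes")
    return warnings

print(numa_warnings(4, 24, hot_add_enabled=False))  # [] -> the sweet spot
print(numa_warnings(8, 48, hot_add_enabled=True))   # all three warnings
```

An empty list corresponds to the "sweet spot" configuration described above; each warning names one of the slowdowns the reply lists.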