VMware Cloud Community
tickermcse76
Contributor

Multi-CPU / High Contention VM cluster - better to enable or disable HyperThreading (HT)?

I believe all VMware documentation recommends the use of HyperThreading, including their whitepapers specific to MS SQL and Oracle database servers.  In an environment that I support, which includes several compute-intensive 4-12 vCPU web and database servers, we see occasional high CPU ready times and a lot of performance variance.  Some of those results may simply come down to oversubscription (anything short of a 1:1 vCPU-to-pCPU ratio is always subject to delays).
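For what it's worth, here is the back-of-the-envelope oversubscription check I have in mind (a rough sketch in Python; the host and vCPU counts below are made up for illustration):

# Rough vCPU:pCore oversubscription check (illustrative numbers only).
# A ratio above 1.0 means vCPUs can end up waiting for physical cores,
# which shows up as CPU ready time.
physical_cores_per_host = 16      # hypothetical: 2 sockets x 8 cores
hosts_in_cluster = 4              # hypothetical cluster size
total_vcpus_powered_on = 180      # hypothetical sum of vCPUs across running VMs

total_pcores = physical_cores_per_host * hosts_in_cluster
ratio = total_vcpus_powered_on / total_pcores
print(f"vCPU:pCore ratio = {ratio:.2f}:1")   # prints 2.81:1 here, i.e. oversubscribed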

But I can't help but think some of these issues are due to HT being enabled and servers contending with themselves on the 2 hyperthreads of a core.  There are various personal blogs that suggest setting processor affinity rules to ensure a particular VM's vCPUs all land on separate physical cores.  At that point, why not just disable HT completely?  Has anyone taken this approach, and if so, can you share your feedback?  In this particular case the compute-intensive VMs would be isolated into their own cluster, while VMs that are more friendly to resource sharing would remain in an HT-enabled cluster.

5 Replies
MKguy
Virtuoso

I don't know why the myth around disabling HT still hasn't died these days. Intel's Pentium 4 NetBurst-based HT from 10 years ago wasn't the best thing ever and in rare cases caused problems, but the HT introduced with the Nehalem architecture 8 years ago is a different story.

If you're already experiencing high %RDY time and CPU contention, then disabling HT will give you even fewer scheduling opportunities and less efficient CPU pipeline parallelization, which means it will just drive up your CPU %RDY even more.

Give it a try: disable HT on a node, put it under the same load pattern and vCPU count, and I'm sure you will see higher %RDY.
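If you do run that comparison, esxtop batch mode is a convenient way to capture %RDY before and after (just a sketch; adjust the sample interval and count to your test window):

# On the host with HT enabled, capture stats every 20 s for ~30 minutes:
esxtop -b -d 20 -n 90 > rdy_ht_on.csv
# Disable HT, reboot, apply the same load, then capture again:
esxtop -b -d 20 -n 90 > rdy_ht_off.csv
# Compare the "% Ready" columns for the same VM worlds in the two files.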

CPU affinity rules will not really help you; they only make sense if you have enough resources (which apparently is not the case) or if you are fine with sacrificing some of the other VMs' performance in favor of a few.

Here's another common misconception around HT which it seems you're basing some of your reasoning on: HT (Nehalem and later) does not work like "1 full fast core and 1 slow side thread". It is just one physical core that presents itself as 2 equal threads in order to utilize the core efficiently should one thread be waiting on I/O or the like. By default ESXi already places each of a VM's vCPUs on a different physical core and will not put 2 vCPUs of the same VM on one physical core (with 2 threads), unless you use silly things like CPU affinity or the numa.preferHT setting when the vCPU count of a VM exceeds the physical core count of a single NUMA node.
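For reference, the per-VM preferHT override is an advanced setting; to my knowledge it lands in the .vmx like this (verify the exact key against the documentation for your version):

# Per-VM .vmx advanced setting (only relevant for wide VMs whose vCPU count
# exceeds the physical core count of one NUMA node):
numa.vcpu.preferHT = "TRUE"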

Here is a general performance whitepaper which clearly advises enabling HT whenever possible:

https://www.vmware.com/pdf/Perf_Best_Practices_vSphere5.5.pdf

Regarding your %RDY issue, how much %RDY on a VM with how many vCPUs are we talking about here? Also make sure you set the physical host power management to maximum performance or OS control, since many servers' default power-saving settings can introduce additional delay.
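As a yardstick when you pull those numbers: vCenter reports CPU ready as a summation in milliseconds per sampling interval, and the usual conversion to a percentage looks roughly like this (a sketch with made-up values; 20 seconds is the realtime stats interval):

# Convert a vCenter "CPU Ready" summation sample (ms) into a percentage.
# Illustrative numbers only.
ready_ms = 1000          # hypothetical cpu.ready.summation sample for the VM
interval_s = 20          # realtime stats sampling interval in seconds
num_vcpus = 8            # hypothetical vCPU count of the VM

ready_pct_total = ready_ms / (interval_s * 1000) * 100   # summed across all vCPUs
ready_pct_per_vcpu = ready_pct_total / num_vcpus         # rough per-vCPU figure
print(f"{ready_pct_total:.1f}% total, ~{ready_pct_per_vcpu:.2f}% per vCPU")
# 1000 ms over a 20000 ms window = 5.0% total, ~0.63% per vCPU in this example.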

-- http://alpacapowered.wordpress.com
tickermcse76
Contributor

HT threads are equal, but only one of them can execute on the shared core at any given point in time, so they have to take turns (and when that happens, CPU ready time increases).  The scheduler will only execute vCPUs on separate pCPUs if they are available; if not, it will take whatever it can get (which is exactly what disabling HT avoids).  If the scheduler always waited for enough pCPUs to be available for every vCPU, I would imagine a great many VM clusters would almost come to a halt.

cianfa72
Enthusiast

MKguy said:

"By default ESXi already places each VM vCPU on a different physical core and will not put 2 vCPUs of the same VM on one physical core (with 2 threads), that is unless you use silly things like CPU affinity or the numa.preferHT setting when the vCPU count of a VM exceeds the physical core count on a single NUMA node"

Here is my doubt: when Hyperthreaded Core Sharing is set to "internal" (VM properties -> Advanced CPU -> Hyperthreaded Core Sharing), how does the system actually behave? Is the default vCPU placement overridden, so that ESXi can put vCPUs belonging to the same VM on different threads (HT) of the same physical core?

cianfa72
Enthusiast

Help!

MKguy
Virtuoso

Here is my doubt: when Hyperthreaded Core Sharing is set to "internal" (VM properties -> Advanced CPU -> Hyperthreaded Core Sharing), how does the system actually behave? Is the default vCPU placement overridden, so that ESXi can put vCPUs belonging to the same VM on different threads (HT) of the same physical core?

Yes, in this case the CPU scheduler can place 2 vCPUs from the same VM on the same physical core. This is one of the rare exceptions I meant and not a default setting.

To quote the Resource Management Guide:

https://pubs.vmware.com/vsphere-55/topic/com.vmware.ICbase/PDF/vsphere-esxi-vcenter-server-55-resour...

Internal

This option is similar to none. Virtual CPUs from this virtual machine cannot share cores with virtual CPUs from other virtual machines. They can share cores with the other virtual CPUs from the same virtual machine.
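If you want to set or audit this outside the GUI, the core-sharing mode is, as far as I know, stored as a per-VM .vmx option; verify the key name against the documentation for your release:

# Hyperthreaded core sharing mode in the .vmx (the default is "any"):
sched.cpu.htsharing = "internal"   # other accepted values: "any", "none"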

-- http://alpacapowered.wordpress.com