VMware Cloud Community
RanjnaAggarwal
VMware Employee

HT with NUMA

What is the recommendation for HT with NUMA: should HT be kept disabled or enabled?

Regards, Ranjna Aggarwal
8 Replies
JCMorrissey
Expert

There's nothing in the vSphere best practices guide to indicate that HT needs to be disabled. If you review pg 22 of the guide, http://www.vmware.com/pdf/Perf_Best_Practices_vSphere5.0.pdf, it does mention a scenario which requires tweaking the vcpu.preferHT flag, but that addresses a very specific circumstance.
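
If you ever do need it, here is a minimal sketch of how a per-VM advanced option like that can be applied with pyVmomi. This is my own illustration, not something from the guide, so verify the exact key name and value against the guide for your vSphere version before using it.

# Sketch only: adds the vcpu.preferHT advanced option mentioned above to a
# VM's extraConfig via pyVmomi. Verify the exact key/value for your release.
from pyVmomi import vim

def set_prefer_ht(vm, value="TRUE"):
    """Reconfigure the (powered-off) VM with the per-VM advanced option."""
    option = vim.option.OptionValue(key="vcpu.preferHT", value=value)
    spec = vim.vm.ConfigSpec(extraConfig=[option])
    return vm.ReconfigVM_Task(spec)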

Please consider marking as "helpful", if you find this post useful. Thanks!... http://johncmorrissey.wordpress.com/
Ethan44
Enthusiast

Hi

Welcome to the communities.

I would go with disabling the HT option.

"a journey of a thousand miles starts  with a single step."
RanjnaAggarwal
VMware Employee

I was also thinking of that.

Regards, Ranjna Aggarwal
mcowger
Immortal

Why would you disable HT?  All the benchmarks show that modern HT, for 95% of workloads, improves performance...

--Matt VCDX #52 blog.cowger.us
RanjnaAggarwal
VMware Employee

Because NUMA does not consider the HT option.

Regards, Ranjna Aggarwal
JarryG
Expert

Just to make things clear, I suppose that by NUMA you mean "non-uniform memory access" and by HT "hyperthreading". If so, what does NUMA have to do with HT? From the NUMA point of view, it does not matter whether a real core (CPU) or a "hyperthreaded core" needs data from memory. The only thing which *does* matter is whether the requested memory page is in a "local" (directly accessible) or a "non-local" (indirectly accessible) memory bank.

HT can have (and probably does have) some impact on the frequency of "cache misses" (two threads still share the same amount of CPU cache), and this might increase the number of memory pages requested, but this effect is very small and outweighed by the benefits of running two threads in parallel.

Or to put it another way: if you do not have problems with HT on UMA (uniform memory access), you will very probably not see problems on NUMA either (and vice versa)...
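
A toy illustration of that point (my own sketch with made-up cost numbers, nothing like real ESXi code): the cost of an access here depends only on whether the page sits on the requesting CPU's node; which hardware thread of the core asked for it never enters the calculation.

# Toy model only: the cost values are arbitrary, not measurements.
LOCAL, REMOTE = 1.0, 1.6  # relative cost of local vs. non-local access

def access_cost(cpu_node, page_node, smt_thread=0):
    # smt_thread is accepted but deliberately ignored: locality is a property
    # of the core's NUMA node, not of the hardware thread issuing the request.
    return LOCAL if cpu_node == page_node else REMOTE

print(access_cost(cpu_node=0, page_node=0))                 # local  -> 1.0
print(access_cost(cpu_node=0, page_node=1, smt_thread=1))   # remote -> 1.6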

_____________________________________________ If you found my answer useful please do *not* mark it as "correct" or "helpful". It is hard to pretend being noob with all those points! 😉
RanjnaAggarwal
VMware Employee

Then what is the meaning of this:

During placement of a vSMP virtual machine, the NUMA load balancer assigns a single vCPU per CPU core and “ignores” the availability of SMT threads

Regards, Ranjna Aggarwal
JarryG
Expert

It does not matter which thread of the same core (or CPU) issues a request for data from memory, because both threads of the same core have the same affinity to a particular memory bank, share the same L1/L2/L3 cache, etc. From the NUMA point of view they are equal. That's the reason why NUMA does not need to keep a list of "CPU threads"; it is enough to keep a list of cores.

You might ask, "why is it then not enough to keep just a list of CPUs?" Because cores inside the same CPU might be organised in a complex way: clustered or multi-layered, sharing or not sharing a common cache, etc. (e.g. the "Bulldozer" microarchitecture, with 8 cores in 4 clustered modules). In that case, 2 cores from the same CPU cluster are, for NUMA, not the same as 2 cores from different CPU clusters on the same physical CPU...
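
To make the quoted placement rule concrete, here is a small sketch of it. This is only an illustration of that one sentence, not the real ESXi NUMA scheduler, and the node/core layout in the example is invented; the point is that the topology is keyed by core id, never by hardware thread.

# Illustration only, not ESXi's scheduler: one vCPU per physical core,
# SMT siblings are never part of the core list.
def place_vcpus(num_vcpus, nodes):
    """nodes: {numa_node_id: [physical core ids]} -> {vcpu_id: core_id}."""
    cores = [core for core_list in nodes.values() for core in core_list]
    if num_vcpus > len(cores):
        raise ValueError("more vCPUs than physical cores; the toy rule "
                         "would have to fall back to SMT threads")
    return dict(zip(range(num_vcpus), cores))

# Example: two nodes with four cores each hold an 8-vCPU VM without HT.
print(place_vcpus(8, {0: [0, 1, 2, 3], 1: [4, 5, 6, 7]}))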

_____________________________________________ If you found my answer useful please do *not* mark it as "correct" or "helpful". It is hard to pretend being noob with all those points! 😉