VMware Cloud Community
kwg66
Hot Shot

NUMA confusion

I have a vSphere host with 2 sockets, 12 cores per socket.  

VM is w2012 standard

The VM is given 8 vCPUs (8 sockets, 1 core per socket in the VM CPU config, as recommended by VMware).

Inside the guest OS, Resource Monitor shows 1 NUMA node.

We updated the vCPUs to 10, and now Resource Monitor inside the guest OS is showing 2 NUMA nodes???

Why did this change occur in the guest OS? And is it even relevant, given it's within the guest OS and NUMA is enabled by default on the VMware side when the VM is given 8 vCPUs?

Now, I have read that some 12-core hardware can actually have NUMA nodes of 6 cores, but even if this were true of my Intel CPUs, it doesn't make any sense that I would see this change in the guest OS when moving from 8 vCPUs to 10.

According to everything I read and absorbed about VMware NUMA, as long as the # of vCPUs is less than the # of cores on the socket, VMware will keep the VM's threads within that NUMA node, or at least make a best effort to do so. 

What's up with the W2012 guest OS changing to show 2 NUMA nodes instead of 1?

12 Replies
daphnissov
Immortal

We updated the vCPUs to 10, and now Resource Monitor inside the guest OS is showing 2 NUMA nodes???

Because when you create a "wide" VM (a VM with sockets over cores) and go above 8 vCPUs, the vNUMA topology of the host gets exposed to the virtual machine. This is correct and intended behavior.
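If it helps to see those two settings side by side, here is a rough pyVmomi sketch (the vCenter name and credentials are placeholders, not anything from this thread) that lists each VM's vCPU count and cores-per-socket and flags anything above the 8-vCPU line where vNUMA gets exposed to the guest:

# Minimal sketch, assuming pyVmomi and a reachable vCenter; VC_HOST/VC_USER/VC_PASS are placeholders.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

VC_HOST, VC_USER, VC_PASS = "vcenter.example.local", "administrator@vsphere.local", "secret"

ctx = ssl._create_unverified_context()   # lab only: skip certificate validation
si = SmartConnect(host=VC_HOST, user=VC_USER, pwd=VC_PASS, sslContext=ctx)
try:
    content = si.RetrieveContent()
    view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.VirtualMachine], True)
    for vm in view.view:
        hw = vm.config.hardware
        exposed = hw.numCPU > 8          # vNUMA is exposed to the guest above 8 vCPUs
        print(f"{vm.name}: {hw.numCPU} vCPUs, {hw.numCoresPerSocket} core(s)/socket, "
              f"vNUMA exposed: {exposed}")
    view.DestroyView()
finally:
    Disconnect(si)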

kwg66
Hot Shot

But how has this become a wide VM? We have 12 cores per socket... there were only 8 vCPUs, and now 10.

daphnissov
Immortal

I'm talking about the VM's configuration, not your host. There's no guarantee that a VM configured with a number of vCPUs less than or equal to the physical core count of a package will fit into one NUMA node. That's an assumption you're making which isn't true. When you configure a VM with 9 or more vCPUs, the NUMA topology gets passed through to the VM. In vSphere 6.5, a change was made that exposes this regardless of cores per socket. You may want to read this article to give you a bit more insight into this and the recent changes to vNUMA.
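If you want to experiment with that threshold, here is a rough pyVmomi sketch (a hypothetical helper; it assumes a vim.VirtualMachine object obtained as in the earlier snippet, and that the VM is powered off or has CPU hot-add enabled) for changing the two settings that drive what topology the guest ends up seeing:

# Hypothetical helper: reconfigure vCPU count and cores-per-socket for a VM.
# Assumes `vm` is a pyVmomi vim.VirtualMachine obtained as in the earlier sketch.
from pyVmomi import vim
from pyVim.task import WaitForTask

def set_vcpu_layout(vm, vcpus, cores_per_socket=1):
    # Changing numCPUs generally requires the VM to be powered off
    # unless CPU hot-add is enabled.
    spec = vim.vm.ConfigSpec(numCPUs=vcpus, numCoresPerSocket=cores_per_socket)
    WaitForTask(vm.ReconfigVM_Task(spec=spec))

# Example: the 10-vCPU, 1-core-per-socket layout from this thread,
# which crosses the 9-vCPU line and so exposes vNUMA to the guest.
# set_vcpu_layout(vm, vcpus=10, cores_per_socket=1)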

kwg66
Hot Shot

OK, I see one of my mistakes: I thought NUMA was exposed at 8 vCPUs and up, but it is actually only above 8 vCPUs... 9 and up.

Also, among the links in the article you sent me to look at, there is an MS tech article explaining that soft NUMA kicks in with SQL and automatically creates 2 soft NUMA nodes, and that explains why we see 2 NUMA nodes in the guest OS.

However, to optimize the VM, don't you want to keep it on a single NUMA node? This is my understanding per many posts I have looked at, meaning you want to try to keep the # of vCPUs allocated less than the number of cores on the physical CPU socket, unless the # of vCPUs is greater than the cores in the NUMA node, in which case you would want to allocate in proportion to the # of cores in the NUMA node... https://www.vmware.com/pdf/Perf_Best_Practices_vSphere5.5.pdf

If yes, and we have 12 cores in our NUMA node, then wouldn't it be advantageous to keep the VM at 12 vCPUs? But what about turning off soft NUMA in the guest OS? Is that desirable given that we have 12 cores in a NUMA node and don't need more than this, but are stuck with SQL soft NUMA splitting it up between 2 nodes?

This scenario is more complex than what most bloggers indicate...

daphnissov
Immortal

What I think would be good is for you to check out the VMware vSphere 6.5 Host Resources Deep-Dive book, as it has a ton of great information on v/NUMA that should answer all your questions. Thanks to Rubrik, it is being offered free of charge here. Highly, highly recommended book.

kwg66
Hot Shot

Hi daphnissov - sadly I'm back, because the VM having performance troubles has an allocation of 8 vCPUs, but the host only has 6 cores per socket (12 logical cores with hyperthreading)... By the logic of 8 vCPUs exceeding the 6 physical cores per socket in my host, NUMA should have presented itself to the guest OS, but it did not; in the W2012 guest OS there is only 1 NUMA node being presented (which is good).

I was under the impression that when documentation refers to a core on a physical system, it means a full physical core and not a logical hyperthreaded core. Based on that (and I'm talking v6.02 here, not 6.5), this VM should have become a wide VM, yet inside the guest OS there is only 1 NUMA node presented.

Based on what I'm seeing in my W2012 guest OS, the documentation below must be referring to logical cores, not physical... Is this correct? Or is there some other factor in play that I could be missing?

See here in the vSphere 6 Best Practices:

[screenshot: quoted NUMA passage from the vSphere 6 Best Practices guide]

daphnissov
Immortal

The passage you've quoted describes how ESXi itself works with NUMA, which is correct, but it isn't showing you how virtual machines interact with NUMA on the underlying host, which is what matters in your case. From the vSphere Resource Management guide for vSphere 6.0 U1, pp. 110-111:

Using Virtual NUMA

vSphere 5.0 and later includes support for exposing virtual NUMA topology to guest operating systems, which can improve performance by facilitating guest operating system and application NUMA optimizations.

Virtual NUMA topology is available to hardware version 8 virtual machines and is enabled by default when the number of virtual CPUs is greater than eight. You can also manually influence virtual NUMA topology using advanced configuration options.

You can affect the virtual NUMA topology with two settings in the vSphere Web Client: number of virtual sockets and number of cores per socket for a virtual machine. If the number of cores per socket (cpuid.coresPerSocket) is greater than one, and the number of virtual cores in the virtual machine is greater than 8, the virtual NUMA node size matches the virtual socket size. If the number of cores per socket is less than or equal to one, virtual NUMA nodes are created to match the topology of the first physical host where the virtual machine is powered on.

When the number of virtual CPUs and the amount of memory used grow proportionately, you can use the default values. For virtual machines that consume a disproportionally large amount of memory, you can override the default values in one of the following ways:

  • Increase the number of virtual CPUs, even if this number of virtual CPUs is not used. See "Change the Number of Virtual CPUs," on page 111.
  • Use advanced options to control virtual NUMA topology and its mapping over physical NUMA topology. See "Virtual NUMA Controls," on page 111.

Also, unless you change the behavior, NUMA will only consider physical cores and not logical cores via HT. If you wish that not to be the case, you will need to use the advanced option numa.vcpu.preferHT = TRUE.
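For reference, a minimal pyVmomi sketch of setting that advanced option (the vm object and helper name are my own assumptions, not something from the guide; the same key/value pair can also be added under the VM's advanced configuration parameters in the client):

# Minimal sketch: add numa.vcpu.preferHT = TRUE to a VM's advanced settings so the
# NUMA scheduler counts logical (HT) threads when sizing NUMA clients.
# Assumes `vm` is a pyVmomi vim.VirtualMachine; takes effect on the next power cycle.
from pyVmomi import vim
from pyVim.task import WaitForTask

def enable_prefer_ht(vm):
    opt = vim.option.OptionValue(key="numa.vcpu.preferHT", value="TRUE")
    spec = vim.vm.ConfigSpec(extraConfig=[opt])
    WaitForTask(vm.ReconfigVM_Task(spec=spec))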

kwg66
Hot Shot

Right, so in my case the # of virtual cores per socket is only 1, the total socket count is only 8, and we don't have the advanced parameter numa.vcpu.preferHT = TRUE set.

But there is no mention there of it being enabled by the "wide VM" scenario, as in other documentation I've seen...

I was assuming the VM was a "wide VM" because the 8 vCPUs exceed the 6 physical cores, but I guess this isn't the case.

So based on this logic, I see only 1 NUMA node in the guest OS, which means that all 8 vCPUs are executing in the 1 NUMA node that has 6 full cores, 12 logical. With HT enabled, each vCPU consumes a logical thread, so there is still room for it to run in the NUMA node, but the old "2 lanes going through the toll booth" scenario arises in this case, which could be hampering the performance a bit.
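As a sanity check on that math, I could pull the host's socket, physical-core, logical-thread, and NUMA-node counts with a quick pyVmomi sketch (a hypothetical helper; `host` would be the vim.HostSystem for this host, looked up the same way as the VMs in the earlier snippet):

# Hypothetical helper: summarize a host's CPU and NUMA topology as ESXi reports it.
# Assumes `host` is a pyVmomi vim.HostSystem.
def host_numa_summary(host):
    cpu = host.hardware.cpuInfo      # sockets, physical cores, logical (HT) threads
    numa = host.hardware.numaInfo    # NUMA topology as seen by ESXi
    nodes = numa.numNodes if numa else 0
    cores_per_node = cpu.numCpuCores // nodes if nodes else cpu.numCpuCores
    print(f"{host.name}: {cpu.numCpuPackages} socket(s), "
          f"{cpu.numCpuCores} physical cores, {cpu.numCpuThreads} logical threads, "
          f"{nodes} NUMA node(s), ~{cores_per_node} physical cores per node")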

I have an adjacent cluster with a 12 core per socket host, 24 logical.    I believe it would be advantageous to move this VM over there to run the tasks required.

daphnissov
Immortal

I was assuming the VM was a "wide VM" because the 8 vCPUs exceed the 6 physical cores, but I guess this isn't the case.

It's considered "wide" because you're preferring sockets over cores, and when you do that you allow ESXi to place the VM into the most appropriate NUMA nodes for its configuration.

So based on this logic, I see only 1 NUMA node in the guest OS, which means that all 8 vCPUs are executing in the 1 NUMA node that has 6 full cores, 12 logical. With HT enabled, each vCPU consumes a logical thread, so there is still room for it to run in the NUMA node, but the old "2 lanes going through the toll booth" scenario arises in this case, which could be hampering the performance a bit.

No, that isn't the conclusion. It is the case that your 8 vCPU VM is spanned across multiple NUMA nodes, but because you have not configured a 9th vCPU, that topology isn't exposed to the VM, and it therefore presents only a single NUMA node.

kwg66
Hot Shot

Part of my understanding of this comes from Frank D's blog: 

[screenshot: underlined passage from Frank D's blog]

The text underlined above is what leads me to believe I should see 2 NUMA nodes in my guest OS, but I believe the folly of my thinking is that I presumed vSphere would present 2 NUMA nodes, when in fact, after vNUMA was exposed to the guest as it should have been based on what is underlined, vSphere made the determination to present just a single NUMA node to the guest...

If this last post doesn't close the loop on this, I might as well take the rest of the day off and get some sleep hahaha...

daphnissov
Immortal

No, that's correct, but those two conditions are AND conditions and not OR. In other words, both of those must be met in order for the NUMA topology to be exposed.

kwg66
Hot Shot

Yep, I need sleep! Damn, I didn't pick up on that. The devil is always in the subtlety of language... so vNUMA is not exposed to the VM in my case.

I will point out, in my defense, that in reviewing this information I came across 3 different interpretations of what a wide VM is:

1) a VM with more vCPUs than the # of physical cores in the NUMA node

2)  a VM that spans more than 1 NUMA node

3)  a VM whereby its allocation is sockets over cores, as in 8 sockets \ 1 core per socket

I should have just tested all the different options with a guest OS so I could see the behavior in front of me...