VMware Cloud Community
ekrejci
Enthusiast

TPS not really working and Hardware-assisted MMU on AMD platform

Hi,

I have four different generations of AMD Opteron in my infrastructure.

I noticed that for VMs with CPU/MMU Virtualization set to Automatic, TPS is barely sharing any pages, especially for Windows guests.

This happens on the three newest Opteron models I have in the infrastructure:

On 2356 TPS -> OK

On 3431 TPS -> not ok

On 6172 TPS -> not ok

On 6272 TPS -> not ok

no TPS: VM-TPS-AMD-004.jpg

I checked the BIOS, and every hardware-assisted virtualization feature is enabled.

If you change CPU/MMU Virtualization to “Intel VT-x/AMD-V for instruction set virtualization and software for MMU virtualization” and do a vMotion, boom, the page sharing goes up.

CPU/MMU Virtualization:

VM-TPS-AMD-005.jpg

Resource allocation:

VM-TPS-AMD-006.jpg
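To spot-check this across all my VMs, here is a rough pyVmomi sketch (untested as pasted here; the vCenter name and credentials are placeholders, and skipping certificate verification is for a lab only) that prints each VM's CPU/MMU flags next to its shared memory counter:

import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

# Placeholders: point this at your own vCenter and credentials.
ctx = ssl._create_unverified_context()   # lab only: skip certificate checks
si = SmartConnect(host="vcenter.example.local",
                  user="administrator@vsphere.local",
                  pwd="********", sslContext=ctx)
try:
    content = si.RetrieveContent()
    view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.VirtualMachine], True)
    for vm in view.view:
        flags = vm.config.flags               # vim.vm.FlagInfo
        stats = vm.summary.quickStats
        print(f"{vm.name}: exec={flags.virtualExecUsage}, "
              f"mmu={flags.virtualMmuUsage}, "
              f"sharedMemory={stats.sharedMemory} MB")
finally:
    Disconnect(si)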

Of course, the implications of using the software MMU have to be taken into consideration; check the vSphere Resource Management guide (chapter 5) and the NUMA considerations.

Has anyone already met this behaviour?

Thank you in advance,

Eric

Linjo
Leadership

This is by design with recent processors and recent operating systems that support large pages.

For more info, have a look at this KB article:

http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=102109...
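Roughly what happens (toy Python only, nothing to do with the real ESXi implementation): sharing works on page content, and a 2 MB large page almost never has an identical twin, while the same region split into 4 KB pages dedupes nicely:

import hashlib
import random

def reclaimable(pages):
    """Bytes saved by backing duplicate pages with a single shared copy."""
    seen, saved = set(), 0
    for page in pages:
        digest = hashlib.sha1(page).digest()
        if digest in seen:
            saved += len(page)
        else:
            seen.add(digest)
    return saved

SMALL, LARGE = 4096, 2 * 1024 * 1024
region = bytearray(LARGE)                  # a mostly-zero 2 MB guest region
region[random.randrange(LARGE)] = 1        # one touched byte somewhere

small_pages = [bytes(region[i:i + SMALL]) for i in range(0, LARGE, SMALL)]
large_pages = [bytes(region)]

print(reclaimable(small_pages))   # ~510 * 4096 bytes could be shared
print(reclaimable(large_pages))   # 0: the single 2 MB page has no twin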

// Linjo

ekrejci
Enthusiast

Hi Linjo,

Thank you for your answer. I thought it was related to large pages.

But for guests that don't run a memory-heavy workload, do you think using the software MMU, which breaks the 2 MB large pages into 4 KB ones so that TPS can share them, might be worthwhile?

It might also be relevant in conjunction with NUMA: using TPS, and thus reducing physical memory usage, would help VMs stay within the same NUMA node. Couldn't this be interesting even when host memory is not overcommitted?
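Just as a back-of-envelope (the node size, VM count and sharing ratio below are pure assumptions, not measurements from my hosts):

# Pure assumptions for illustration, not measurements.
node_mem_gb = 32        # local memory per NUMA node
vm_count    = 6         # similar Windows guests on that node
vm_mem_gb   = 6         # configured memory per VM
tps_saving  = 0.25      # fraction of pages TPS manages to share

raw_gb    = vm_count * vm_mem_gb
shared_gb = raw_gb * (1 - tps_saving)
print(f"without TPS: {raw_gb} GB, with TPS: {shared_gb:.1f} GB, "
      f"fits in one NUMA node: {shared_gb <= node_mem_gb}")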

Eric

Linjo
Leadership

If I understand it correctly (I am not a performance engineer, so I might be wrong), there will be a penalty for breaking up the large pages into 4 KB pages.

You could easily try it by running some benchmarks like VMmark.

// Linjo

ekrejci
Enthusiast

Indeed, the software MMU adds some overhead.

I found this paragraph in the vSphere 5.0 Resource Management guide (page 28):

Performance Considerations

When you use hardware assistance, you eliminate the overhead for software memory virtualization. In particular, hardware assistance eliminates the overhead required to keep shadow page tables in synchronization with guest page tables. However, the TLB miss latency when using hardware assistance is significantly higher. As a result, whether or not a workload benefits by using hardware assistance primarily depends on the overhead the memory virtualization causes when using software memory virtualization. If a workload involves a small amount of page table activity (such as process creation, mapping the memory, or context switches), software virtualization does not cause significant overhead. Conversely, workloads with a large amount of page table activity are likely to benefit from hardware assistance.
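A rough way to see where that higher TLB-miss latency comes from (textbook page-walk counts, nothing I measured): with shadow page tables a miss walks one 4-level table, while with nested paging every reference of the guest walk itself goes through the nested tables:

guest_levels, nested_levels = 4, 4    # 4-level guest and nested page tables

# Shadow page tables: one ordinary 4-level walk on a TLB miss.
shadow_refs = guest_levels

# Nested paging: each of the guest walk's references (4 table levels plus
# the final guest-physical address) needs its own nested walk, on top of
# the 4 guest page-table reads themselves.
nested_refs = (guest_levels + 1) * nested_levels + guest_levels

print(shadow_refs, nested_refs)   # 4 vs 24 memory references per miss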

So, as you suggest, I will run some benchmarks. But the software MMU might be an interesting option for VMs with a small amount of page table activity, in order to benefit from TPS.

Thank you for your answers.

Eric
