mcwill
Expert

ESX4 + Nehalem Host + vMMU = Broken TPS !


Since upgrading our two-host lab environment from ESX 3.5 to 4.0, we are seeing poor Transparent Page Sharing (TPS) performance on our new Nehalem-based HP ML350 G6 host.

Host A : ML350 G6 - 1 x Intel E5504, 18GB RAM

Host B : Whitebox - 2 x Intel 5130, 8GB RAM

Under ESX 3.5, TPS worked correctly on both hosts, but on ESX 4.0 only the older Intel 5130-based host appears to be able to scavenge inactive memory from the VMs.

To test this out I created a new VM from an existing Win2k3 system disk. (Just to ensure it wasn't an old option in the .vmx file that was causing the issue.) The VM was configured as hardware type 7 and was installed with the latest tools from the 4.0 release.

During the test the VM was idle and reporting only 156MB of its 768MB as in use. The VM was VMotioned between the two hosts and, as can be seen from the attached performance graph, there is a very big difference in active memory usage.

I've also come across an article by Duncan Epping at yellow-bricks.com that may point to vMMU as the cause...

MMU article

If vMMU is turned off in the VM settings and the VM is restarted, then TPS operates as expected on both hosts. (See second image)

So if it comes down to choosing between the two, would you choose TPS over vMMU or vice versa?

123 Replies
admin
Immortal

>>You talk of nesting page tables which I assume means s/w MMU is still running.

Nope. Nested paging is a hardware MMU feature. The page table structures are maintained by software, but the hardware (the MMU) does the page table walk to fetch the translation from those structures (or fetches it from the TLB if it is already cached). Software MMU does not use nested page tables; instead it uses shadow page tables, and the hardware walks the shadow page tables directly (so there is no additional cost for TLB misses).
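To put rough numbers on the TLB miss cost mentioned above, here is a minimal Python sketch. It uses the commonly cited worst-case count for a two-dimensional (nested) walk, (g + 1)(h + 1) - 1 memory references for g guest levels and h host levels. It assumes no paging-structure caches, and the function name `walk_refs` is purely illustrative, so treat the results as an upper bound rather than a measurement of any particular CPU.

```python
def walk_refs(guest_levels: int, host_levels: int = 0) -> int:
    """Worst-case memory references needed to service one TLB miss.

    host_levels == 0 models a native or shadow page table walk, where the
    hardware walks a single table. Otherwise each guest-level pointer, and
    the final guest physical address, must itself be translated through
    the nested (host) table.
    """
    if host_levels == 0:
        return guest_levels
    return (guest_levels + 1) * (host_levels + 1) - 1


shadow = walk_refs(4)        # 4-level shadow page table
nested_4k = walk_refs(4, 4)  # 4-level guest table, nested table mapping 4K pages
nested_2m = walk_refs(4, 3)  # nested table ends one level early for 2M pages
print(shadow, nested_4k, nested_2m)  # 4 24 19
```

Under these assumptions a nested walk is several times more expensive per miss than a shadow walk, which is why reducing the number of TLB misses matters so much on NPT/EPT hardware.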

>>is there any benefit in running hardware MMU with small pages?

Oh yes absolutely.

>>The "Large Page Performance" article talks of configuring the guest to use Large Pages; this isn't something I had done for any of our guests. Do the tools handle that now or is it still a required step?

There are different levels of large page support. Applications inside the guest can request large pages from the OS, and the OS can assign large pages if it has contiguous free memory. But large pages mapped by the guest OS (i.e. guest physical pages) may or may not be backed by actual large pages (machine pages) in the hypervisor. For instance, the guest may think it has a 2M chunk while the hypervisor uses 4K chunks to map it; in this case the hypervisor has to access multiple 4K pages, so there is no performance benefit even though the guest uses large pages. In ESX 3.5 we introduced support for large pages in the hypervisor: whenever the guest uses large pages, we explicitly go and find machine large pages to back them. This helps performance, as demonstrated by the whitepaper.

In addition to this, on NPT/EPT machines the hypervisor also opportunistically tries to back all guest pages (small or large) with large pages. For instance, even if the guest is not mapping large pages, contiguous 4K regions of the guest can be mapped by a single large page in the hypervisor, which helps performance (by reducing TLB misses). This is the reason why you see large pages in use even though the guest is not explicitly requesting them.
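The arithmetic behind "fewer TLB misses" can be sketched in a few lines of Python. The 768 MB figure is just the size of the test VM from the original post, and `mappings_needed` is an illustrative helper, not anything from ESX.

```python
SMALL_PAGE = 4 * 1024          # 4K page
LARGE_PAGE = 2 * 1024 * 1024   # 2M page


def mappings_needed(region_bytes: int, page_bytes: int) -> int:
    """Number of page mappings (and so TLB entries) needed to cover a region."""
    return -(-region_bytes // page_bytes)  # ceiling division


region = 768 * 1024 * 1024  # a 768 MB guest, as in the original test VM
print(mappings_needed(region, SMALL_PAGE))  # 196608
print(mappings_needed(region, LARGE_PAGE))  # 384
print(LARGE_PAGE // SMALL_PAGE)             # 512
```

Each large page mapping covers what would otherwise take 512 small page mappings, so the same TLB can cover far more guest memory before it starts missing.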

>>I have to disagree as to whether this is a bug or not. Large pages make 80% of our VMs alarm due to excess memory usage, and negate one of the main differences between ESX and Hyper-V.

There are two issues here. 1) TPS not yielding sufficient memory savings when large pages are being used. This is not a bug; it is a tradeoff that you have to make. The workaround I suggested lets you choose which tradeoff you want on NPT/EPT hardware. 2) The VM memory alarm. This is a separate issue and does not depend on page sharing. The VM memory usage alarm turns red whenever guest active memory usage goes high. Guest active memory is estimated through random statistical sampling, and the algorithm the hypervisor uses overestimates active memory when guest small pages are backed by large pages (since the estimate is done with reference to machine pages). This is a bug. For now you can simply ignore this alarm (it is a false alarm); I was told that we will be fixing this pretty soon. Note, however, that this will only fix the alarm; the memory usage of the VM will remain the same.
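A toy model can show why estimating activity with reference to large machine pages inflates the number. This is not the hypervisor's actual algorithm; it only assumes that a tracking unit counts as active if any 4K page inside it was touched, and the access pattern (one page in every 64) is hypothetical.

```python
import random

SMALL_PER_LARGE = 512  # 4K pages in one 2M page


def active_units(touched_small_pages, pages_per_unit):
    """Tracking units that contain at least one touched 4K page."""
    return {page // pages_per_unit for page in touched_small_pages}


def exact_active_fraction(touched_small_pages, total_small_pages, pages_per_unit):
    """Fraction of tracking units that are active, computed exhaustively."""
    total_units = total_small_pages // pages_per_unit
    return len(active_units(touched_small_pages, pages_per_unit)) / total_units


def sampled_active_fraction(touched_small_pages, total_small_pages,
                            pages_per_unit, samples, rng):
    """Estimate the active fraction by probing randomly chosen units."""
    total_units = total_small_pages // pages_per_unit
    active = active_units(touched_small_pages, pages_per_unit)
    hits = sum(rng.randrange(total_units) in active for _ in range(samples))
    return hits / samples


total = 768 * 256              # a 768 MB guest expressed in 4K pages
touched = range(0, total, 64)  # hypothetical: guest touches 1 page in every 64

rng = random.Random(0)
print(exact_active_fraction(touched, total, 1))                # 0.015625
print(exact_active_fraction(touched, total, SMALL_PER_LARGE))  # 1.0
print(sampled_active_fraction(touched, total, SMALL_PER_LARGE, 100, rng))  # 1.0
```

In this model the guest touches under 2% of its 4K pages, yet every 2M unit contains a touched page, so an estimate taken at large page granularity reports 100% active. That is the shape of the false alarm described above.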


mcwill
Expert

Thank you for taking the time to produce such a detailed response, it has certainly helped my understanding of what is and isn't happening.

Regards,

Iain

Rajesh_Venkatas
Contributor

Thanks Kichaonline for the accurate information. A small correction: we are currently investigating ways to fix the high memory usage issue as well. Regarding TPS, as noted earlier, this should not lead to any performance degradation. When a 2M guest memory region is backed with a machine large page, VMkernel installs page sharing hints for the 512 small (4K) pages in the region. If the system becomes overcommitted at a later point, the machine large page is broken into small pages, and the previously installed page sharing hints help to quickly share the resulting small pages. So low TPS numbers when a system is undercommitted do not mean that we won't reap the benefits of TPS when the machine gets overcommitted. Thanks.
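The hint mechanism described above can be modelled with a short Python sketch. This is a simplified illustration, not VMkernel code: the `Host` class, the use of SHA-1 as the hint, and the two example regions are all assumptions made for the example. It also ignores the fact that page contents can change after a hint is recorded.

```python
import hashlib

SMALL_PAGE = 4096
SMALL_PER_LARGE = 512  # 4K pages in one 2M page


class Host:
    def __init__(self):
        self.large_pages = []  # (contents, hints) per machine large page
        self.shared = {}       # hint -> content of the single shared copy
        self.private = 0       # small pages that could not be shared

    def back_with_large_page(self, small_pages):
        """Back a 2M guest region with one large page and record hints."""
        assert len(small_pages) == SMALL_PER_LARGE
        hints = [hashlib.sha1(p).digest() for p in small_pages]
        self.large_pages.append((list(small_pages), hints))

    def machine_small_pages(self):
        """Machine memory in use, counted in 4K units."""
        return (len(self.large_pages) * SMALL_PER_LARGE
                + len(self.shared) + self.private)

    def break_large_pages(self):
        """Under overcommit: split large pages and share using the hints."""
        for contents, hints in self.large_pages:
            for content, hint in zip(contents, hints):
                existing = self.shared.setdefault(hint, content)
                if existing != content:  # hint matched but content differs
                    self.private += 1
        self.large_pages = []


zero = bytes(SMALL_PAGE)


def region(tag):
    """Hypothetical 2M region: 500 zero pages plus 12 pages unique to the VM."""
    unique = [f"{tag}-{i}".encode().ljust(SMALL_PAGE, b"\0") for i in range(12)]
    return [zero] * 500 + unique


host = Host()
host.back_with_large_page(region("vmA"))
host.back_with_large_page(region("vmB"))
before = host.machine_small_pages()
host.break_large_pages()
after = host.machine_small_pages()
print(before, after)  # 1024 25
```

While the host is undercommitted, both regions stay on large pages and nothing is shared (1024 small-page equivalents). Once the large pages are broken, the precomputed hints let identical pages collapse immediately, leaving one zero page plus the 24 unique pages.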

tonybunce
Contributor

We are seeing the same issue on our ESX servers that have been upgraded from 3.5 to 4.0. After the upgrade most of our VMs are in alarm for excessive memory, but inside the guest OS there is lots of free memory. For example, most of our Linux servers (CentOS x64) show 1.5GB free inside the guest, but the Infrastructure Client reports that they are using the full 2.0GB. If I shut down or reboot a VM, it goes down to 1.0-1.5GB of used memory, but usage eventually creeps back up and never goes down.

We are using quad-core Xeon 5500s with 48GB of RAM.

If 3.5 couldn't do hardware MMU with these processors, would disabling hardware MMU give us the same performance we were seeing in 3.5? Or are we better off using hardware MMU and setting Mem.AllocGuestLargePage to 0?

It sounds like this may be a separate issue from TPS; it seems as if ESX4 isn't reclaiming free memory that the guest isn't actively using.

Rajesh, do you have any details regarding the high memory issue that you said is being investigated?

admin
Immortal

If I'm not mistaken, the Intel Xeon 5500 is a Nehalem processor, so yes, you will see the high memory usage alarm in ESX 4.0 and not in ESX 3.5. As I mentioned earlier, you can ignore this alarm as it is a false positive (it happens only when large pages are used), and this issue will be fixed pretty soon.

>>are we better off using hardware MMU and setting Mem.AllocGuestLargePage to 0?

You should always use hardware MMU whenever possible. Setting Mem.AllocGuestLargePage to 0 is a workaround to get instant TPS benefits; as a side effect it will also fix the alarm problem (since large pages are disabled with this option).

>> seems as if ESX4 isn't reclaiming free memory that the Guest isn't actively using.

That is not correct. ESX reclaims unused (but previously allocated) memory through ballooning and through TPS. The problem is that when large pages are used, TPS doesn't kick in instantly, so you won't get the instant gratification of seeing TPS memory savings. However, when your system is under memory overcommitment, the vmkernel memory scheduler will transparently break large pages into small pages and collapse them with other shareable pages. This feature, called "share before swap", is new in ESX 4.0. So you still get the same benefits from TPS, but only at the time of memory overcommitment.

To summarize, when you use ESX 4.0 on Nehalem/Barcelona/Shanghai (EPT/NPT) systems:

a) Ignore the high memory usage alarm; this will be fixed pretty soon.

b) Don't worry about TPS; it will kick in automatically when your system is under memory overcommitment.

admin
Immortal

Just one more item to the summary

c) Use the Mem.AllocGuestLargePage=0 workaround only if you need instant TPS sharing benefits (you will, however, need to VMotion the VM off the host and back again for the change to take effect).

I will also draft a KB article and publish it soon for the benefit of others.

tonybunce
Contributor

Thanks for the summary, it was very helpful.

MDOmnis
Contributor

I've raised SR #1401115468 about this same issue with a Dell R710 seeing high guest memory usage reported in both vCenter and when directly connected to the host with the vSphere client. Seems to affect both 2003 and 2008 guests.

Is there anything official from VMware out there on this yet?

steveanderson3
Contributor

I'm seeing this same issue now as well. We have a start-up environment: one host is a DL380 G5 running ESX 3.5 U4, and one host is a DL380 G6 5540 running ESXi 4. I'm seeing the high active memory issue on the 2008 and 2003 servers running on the ESXi 4 host. I'm hoping to hear of a fix soon as well, as we are about to start rolling out a large number of DL380 G6s.

Aesthetic
Contributor

Just chiming in - I'm seeing the same problem on my Dell R710s. I've got an open support case for this with VMware. Hopefully we'll see a bug fix shortly, though this is my first "bug" experience with VMware so unsure how soon is realistic...

freez267
Contributor

I have the same problem on my Dell PE1950 with Intel 5400 processors.

I have just updated from 3.5 U4 to 4.0 and the memory usage is too high.

Aesthetic
Contributor

What kind of processors are in those PE1950s? I thought the issue we're seeing was fairly certain to be specific to Nehalem hosts...

freez267
Contributor

I know that the issue is supposed to affect Nehalem hosts, but I'm not the only one who has this problem with a different processor (see the other posts).

My processor is a quad-core Xeon E5410 at 2.33 GHz.

Aesthetic
Contributor

K. I haven't seen any other posts re: other processors so I'll take your word for it. I thought this specific issue was isolated to the Nehalems...

LarsEber
Contributor

I am having this issue and I have AMD hosts. For me, changing the advanced setting and then VMotioning off and back on did not fix the issue. The VM is still in alarm for memory usage (80+%), while the guest OS reports much lower usage (25%).

Charadeur
Contributor

I am seeing the same issue on quad core xeon processors. These VMs ran fine on ESX 3.5. Changing the advanced setting and vmotioning did not help for us either. A fix would be nice.

joergriether
Hot Shot

This one is getting on my nerves!!! A simple fix just for the incorrect guest memory reporting problem, which states 97% all the time, would be the least I expected for the first update interval. But it is NOT fixed!!!!!

I just installed a fresh W2008 x64 guest with 4 GB of vmem onto a Dell 710 (Nehalem 5520). Inside the guest it says it consumes 1.2 GB; the vSphere Client says MARK, ALARM and 97%.

VMware, when will this be fixed?

Best,

Joerg

jcrivello
Contributor

I am having the same problem in a non-production environment. I have three ESXi 4.0 hosts that were very recently on 3.5. The processor/machine types are listed below:

1) Intel Xeon E5405 (Dell PowerEdge 2900)

2) Intel Xeon E5410 (Dell PowerEdge 2950)

3) AMD Opteron 2382 (Dell PowerEdge 2970)

Interestingly, I have no memory problems with (1) or (2). However, I am observing a problem on (3), and ONLY with 64-bit guests. I have a couple of 32-bit guests that are at 5-13% guest memory utilization, but all of the 64-bit guests are at 81-97%. I should also mention that I attempted the advanced settings workaround suggested in this thread, to no avail. I also tried each of the three available MMU options on the guests, with no change.

Furthermore, I should make it clear that it's not just the guest memory that is being reported as high on these guests, it's the consumed host memory. On literally all of the 64-bit machines the amount of consumed host memory is near the limit of what it could be.

This is a pretty serious problem for us.

Aesthetic
Contributor

The bad news - support and other reps are all implying that the intention is not to address this issue until the first update interval (i.e. no stand-alone "patch").

The (maybe) good news - support also said the first update interval is likely "sooner than you think". Granted, that's likely just to keep me quiet and patient, BUT, here's hoping... 😕

This is actually causing a huge problem for us. This is our first 'major' virtualization initiative. While all has been going extremely smoothly and actual performance is great, this bug is causing a lot of uncertainty and nervousness with management, who will not allow the virtualization of our "core" servers until this bug is corrected. It's reducing their faith in the product. I actually had one VP asking me yesterday about "That Microsoft product..."

AlokGupta
Contributor

We are aware of this issue and it will be fixed in the first update.
