VMware Communities
dzamonorton
Contributor

VMware Fusion 7 and VMware Tools - Memory Leak in Debian Guest

At some point during my upgrades of OS X to Yosemite, VMware Fusion to 7.x, and my Debian guest to 8.0 "Jessie" on my 2012 MacBook Pro, a Linux kernel memory leak appeared.  Following these steps, I soon lose about 3 GB of kernel memory:

  1. Install Debian Jessie in a new VMware Fusion VM.
  2. Install open-vm-tools or the bundled VMware Tools.
  3. Perform some I/O-intensive operation, such as cat /dev/zero | head -c 1000000000 > /tmp/test.txt.
  4. Run smem -wtk and see ~3 GB of memory under kernel dynamic memory, Noncache.
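
To track the figure from step 4 between runs, a small helper can pull the Noncache column out of the smem report. This is only a sketch: kernel_noncache is a hypothetical name, and it assumes the row is labelled "kernel dynamic memory" with Noncache as the last column.

```shell
# Print the Noncache column (last field) of the "kernel dynamic memory"
# row from `smem -wtk` output supplied on stdin.
kernel_noncache() {
    awk '/^kernel dynamic memory/ { print $NF }'
}

# Usage inside the guest (assumes smem is installed):
#   smem -wtk | kernel_noncache
```

Running it before and after the I/O step makes the jump easy to compare.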

I opened support request 15709682007, and the representative said they would fix the issue in a future release. In the meantime, I have what appears to be a workaround for anyone else who is affected: pulling in packages from jessie-backports appears to have solved the problem.  The workaround uses the guest additions from open-vm-tools.

  1. Add Debian jessie-backports repository to /etc/apt/sources.list
  2. sudo apt-get update; sudo apt-get install -t jessie-backports open-vm-tools-dkms open-vm-tools-desktop
  3. sudo apt-get -t jessie-backports dist-upgrade (installs kernel 4.1)
  4. Restart guest

Accepted Solutions
dariusd
VMware Employee

A few quick thoughts:

1. If you have many VMs (or other programs) running on your host, and VMware Fusion detects memory pressure on the host, it will use the in-guest "balloon" driver (vmw_balloon on recent Linux distros) to steal some memory from the guest OS.  Such "stolen" memory appears as non-cache kernel dynamic memory inside the guest.  Running "sudo rmmod vmw_balloon" should release any ballooned memory to the guest OS.  Note that the balloon driver usually improves guest and host performance, despite the scary appearance of "used memory" inside the guest.  That "used memory" is just an accounting trick we need in order to reprioritize memory between the host and the guest.  It isn't "used" in the conventional sense.

2. I believe that recent virtual hardware versions will allocate non-cache kernel dynamic memory for SVGA and 3D features (display RAM, texture RAM, etc.).  Disabling 3D graphics acceleration or reducing the amount of shared graphics memory if you are using a new virtual hardware version (both in Virtual Machine > Settings... > Display) might help.  Downgrading to an earlier virtual hardware (in Virtual Machine > Settings... > Compatibility) might also help.
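
To tell the two cases apart, it may help to ask the guest tools how much memory the balloon currently holds. The sketch below assumes that "vmware-toolbox-cmd stat balloon" (from VMware Tools or open-vm-tools) prints a line of the form "N MB"; balloon_mb and balloon_exceeds are hypothetical helper names, and the 1024 MB threshold is arbitrary.

```shell
# Extract the leading number from a line such as "3072 MB" read on stdin.
balloon_mb() {
    awk '{ print $1; exit }'
}

# Succeed when the balloon size (first argument, in MB) is at least the
# threshold (second argument, in MB).
balloon_exceeds() {
    size_mb=$1
    threshold_mb=$2
    [ "$size_mb" -ge "$threshold_mb" ]
}

# Usage inside the guest:
#   size=$(sudo vmware-toolbox-cmd stat balloon | balloon_mb)
#   balloon_exceeds "$size" 1024 && echo "balloon holds ${size} MB"
```

A value near zero would point away from ballooning and toward the graphics memory described in point 2.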

Cheers,

--

Darius

dzamonorton
Contributor

I've just seen that my workaround doesn't work.  Perhaps it buys some time until the leak occurs (perhaps not), but it does not prevent it.  Here is what my VM looks like now according to smem, time for a reboot :-(.

Area                       Used  Cache   Noncache
firmware/hardware             0      0      0
kernel image                  0      0      0
kernel dynamic memory      4.5G   1.8G   2.7G
userspace memory           2.0G 294.6M   1.7G
free memory              602.7M 602.7M      0
dzamonorton
Contributor

Thank you, Darius, for this very helpful response.  Running sudo rmmod vmw_balloon did the trick.

I can believe this ballooning feature is good in many contexts, but I personally almost always run nothing but this single VM on my host.  Ballooning of 3 GB of guest memory was leading to heavy swapping in the guest and major slowdowns for me on some workloads.  So I'll be scripting sudo rmmod vmw_balloon into my guest (or disabling the kernel module in some other, more appropriate way).
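
A minimal sketch of such a script, assuming it runs as root in the guest (for example from a boot-time hook of your choosing). module_loaded and release_balloon are hypothetical names; the check reads /proc/modules, where the module name is the first field of each line.

```shell
# Succeed if the named module appears in a /proc/modules-style listing.
# The optional second argument overrides the listing path (default /proc/modules).
module_loaded() {
    awk -v m="$1" '$1 == m { found = 1 } END { exit !found }' "${2:-/proc/modules}"
}

# Unload vmw_balloon only when it is actually loaded, so the script
# does not fail on guests where the module is absent.
release_balloon() {
    if module_loaded vmw_balloon; then
        rmmod vmw_balloon
    fi
}

# Usage (as root):
#   release_balloon
```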

dariusd
VMware Employee

Glad I was able to help!

Out of curiosity, how much RAM is installed in your Mac, and how much RAM is allocated to the VM?  If you have the time to run an experiment for me, I'd be interested to see the results if you could leave vmw_balloon loaded in the guest, reproduce the performance issue, and then run this command in Terminal on the host: "vm_stat".  If you could paste the results back into a reply here (along with the RAM sizes of your host and the VM), that'd be superb.

Cheers,

--

Darius

dzamonorton
Contributor

8 GB of RAM in the Mac.  Currently 7000 MB is allocated to the guest, but I have experimented with lower guest memory allocation values.

I've run your experiment.  smem -wtk in the guest says:

Area                       Used  Cache   Noncache
firmware/hardware             0      0      0
kernel image                  0      0      0
kernel dynamic memory      4.5G   2.1G   2.4G
userspace memory         974.1M 202.0M 772.1M
free memory                1.2G   1.2G      0

vm_stat in the host says

Mach Virtual Memory Statistics: (page size of 4096 bytes)
Pages free:                                3929.
Pages active:                            345625.
Pages inactive:                          329596.
Pages speculative:                         6077.
Pages throttled:                              0.
Pages wired down:                       1335316.
Pages purgeable:                          27075.
"Translation faults":                  20004482.
Pages copy-on-write:                     220435.
Pages zero filled:                      3398336.
Pages reactivated:                       578318.
Pages purged:                            761240.
File-backed pages:                       552431.
Anonymous pages:                         128867.
Pages stored in compressor:              284421.
Pages occupied by compressor:             75907.
Decompressions:                          704325.
Compressions:                           1273631.
Pageins:                                5365743.
Pageouts:                                721390.
Swapins:                                  47560.
Swapouts:                                 80962.
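
These counters are in 4096-byte pages, so 256 pages make one MiB. By that arithmetic the host has roughly 5216 MiB wired down and only about 15 MiB free. A small sketch of the conversion follows; vm_stat_field and pages_to_mib are hypothetical helper names, and the 4096-byte page size is taken from the header line above.

```shell
# Convert a count of 4096-byte pages to whole MiB (256 pages per MiB).
pages_to_mib() {
    echo $(( $1 / 256 ))
}

# Read vm_stat output on stdin and print the numeric value of one counter,
# e.g. "Pages wired down", with the trailing period removed.
vm_stat_field() {
    awk -F': *' -v key="$1" '$1 == key { sub(/\.$/, "", $2); print $2 }'
}

# Usage on the OS X host:
#   pages_to_mib "$(vm_stat | vm_stat_field 'Pages wired down')"
```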

dzamonorton
Contributor

It now looks like I was the architect of my own difficulties.  Since I backed off from the very aggressive allocation of 7000 MB of the host's 8 GB to the guest, memory ballooning seems to be gone or unnoticeable.

dariusd
VMware Employee

Yeah, leaving only 1 GByte for an entire OS X host and its applications is probably going to lead to trouble.

Another interesting thing to try: If you currently have the guest OS type (in Virtual Machine > Settings... > General) set to Debian (or Debian 64-bit), try changing it to Ubuntu (or correspondingly Ubuntu 64-bit if appropriate)... It looks like we assume that Debian won't use/need as much RAM as Ubuntu, so we allow the balloon to grow more aggressively on Debian VMs.  As it stands now, it may well be too aggressive for a modern Debian guest, enough that I could imagine it could potentially cause performance problems.  Changing the guest OS type from Debian to Ubuntu will make the balloon a smidgen less aggressive.  If you're up for another experiment, give that a try and see if the performance problems are somewhat alleviated.
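
To confirm which guest OS type a VM is currently set to without opening the settings window, one option is to read it from the VM's configuration file. This sketch rests on assumptions not stated in this thread: that the setting is stored in the .vmx file as a line like guestOS = "debian8-64" or guestOS = "ubuntu-64", and that the file lives inside the VM bundle. vmx_guest_os is a hypothetical helper name and the path in the usage comment is a placeholder.

```shell
# Print the value of the guestOS key from a .vmx file.
vmx_guest_os() {
    sed -n 's/^guestOS *= *"\(.*\)"$/\1/p' "$1"
}

# Usage on the host (placeholder path):
#   vmx_guest_os "/path/to/MyVM.vmwarevm/MyVM.vmx"
```

Changing the type itself is best done through the settings window described above, with the VM shut down.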

Cheers,

--

Darius

dzamonorton
Contributor

I changed the guest OS type to Ubuntu and gave it 7000 MB again just to try to make trouble.  I haven't been able to get the kernel dynamic memory noncache total above ~230 MB in spite of determined usage.  Subjectively, it felt like that same figure hovered around ~330 MB when the guest type was Debian and before ballooning kicked in.  So this is great overall.

martindorey
Contributor

My guest type had become horribly out of date (Debian 4 (32-bit) when the guest was really Debian 9 (64-bit)).  Changing it, even to Ubuntu Linux (64-bit), didn't help me, but:

    sudo rmmod vmw_balloon

... how I wish I'd known that, or even just the clue that was:

    sudo vmware-toolbox-cmd stat balloon

... a week ago.  I spent ages struggling to understand how there could be jiggabytes of memory reported with zero flags by:

<https://github.com/torvalds/linux/blob/master/tools/vm/page-types.c>

user@machine:/var/tmp/D145677$ sudo ./page-types
             flags        page-count       MB  symbolic-flags                        long-symbolic-flags
0x0000000000000000           2774269    10836  ___________________________________________

I don't know why just one of our VMs is afflicted, and only, seemingly, after we increased the RAM from 1 GiB to 2 GiB (then 4 GiB, 8 GiB, 16 GiB, desperately trying to stave off this leak).  As with the OP, performing I/O, as little as reading 0.5 GiB from /dev/sda and throwing it away, would make VMware start ballooning 10 GiB of the guest's 16 GiB of RAM away, never to give it back, even though it drove Linux to grind almost to a halt.

It ground so badly, indeed, that it wouldn't even recover fully after the rmmod freed the memory.

    { echo '# D145677'; echo blacklist vmw_balloon; } | sudo tee /etc/modprobe.d/vmw_balloon-blacklist.conf

... and a reboot was needed before we were really off to the races again.

I hope this post gets the next poor sap some better Google hits.  We'd tried upgrading within Debian Jessie, upgrading to Debian Stretch, installing open-vm-tools, removing them again, upgrading them to Stretch-Backports, cloning the VM, recompiling our own kernel (the upstream default config, with the Fusion MPT ScsiHost drivers for SPI needed for VMware, didn't suffer, but Debian's config, with ballooning support, did), and adding kmemleak checking (which showed nothing).  Reporting of ballooned memory in /proc/meminfo would have been so useful.

https://serverfault.com/questions/780098/why-doesnt-memory-consumed-by-balloon-driver-not-show-up-in...

Yes, good question.

martindorey
Contributor

Edit Settings, Memory, click the > to expand, Limit was 1024 "MB".  Groan!
