The Workstation/Player 16 woes already include a failure of Wayland support, a failure to compile, a failure to start, and a failure to launch any guest. We now can add a failure of memory management.
Specifically, on an Ubuntu 22.04 host, a Windows 10 guest freezes when, e.g., a database application (e.g., Microsoft Access with an .accdb file) is opened or a video is played in a browser (specifically, Firefox). The freeze also occurs without either of the foregoing, with as few as seven tabs of static content open in Firefox.
Running # top on the host reveals that the process kcompactd is pinning one CPU core at 100% while the process vmware-vmx is running at 280%-290%.
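For anyone who wants to capture the same observation non-interactively, here is a minimal sketch. It assumes top's default batch-mode column layout (%CPU in field 9, COMMAND last); the function name vm_cpu_rows is made up for illustration.

```shell
# Keep only the kcompactd and vmware-vmx rows from batch-mode top output
# and print "COMMAND %CPU" for each one.
# Assumption: default top layout, where %CPU is field 9 and COMMAND is last.
vm_cpu_rows() {
    awk '$NF ~ /^(kcompactd|vmware-vmx)/ {print $NF, $9}'
}
```

Typical use on the host would be: top -b -n 1 | vm_cpu_rows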
This is notwithstanding various configuration changes recommended in sundry posts across the Internet to address transparent hugepages, memory compaction, video demands, and guest behavior, including:
Again, notwithstanding all of the foregoing, which took a half day to implement, a Windows 10 guest nevertheless continues to freeze under trivial memory demands.
Windows 10 guests running in VirtualBox are reported not to exhibit this behavior. I currently am evaluating this and a similar guest running in KVM myself.
Windows 10 guests running in Player 15 did not exhibit this behavior, nor any of the above-mentioned other failures.
Clearly, the VMware devs have their work cut out for them to fix this flaw.
Pending a bug fix, any suggestions of further configuration to eliminate this issue would be especially well received.
A solution seems to exist. This turns out to be a host IOMMU issue.
Notwithstanding that Intel VT-x and VT-d are enabled in the host firmware, kernel version 5.15.0-46-generic disables IOMMU by default. The kernel is configured with:
CONFIG_INTEL_IOMMU=y
# CONFIG_INTEL_IOMMU_DEFAULT_ON is not set
That is, intel_iommu is compiled into the kernel but is not (or no longer is) enabled by default (note the octothorpe commenting out the second configuration option).
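To check the same two options on another host, something like the following should do. It is only a sketch: it assumes the Ubuntu convention of shipping the kernel config as /boot/config-<release>, and the function name iommu_config is made up for illustration.

```shell
# Print the CONFIG_INTEL_IOMMU and CONFIG_INTEL_IOMMU_DEFAULT_ON lines
# from a kernel config file. With no argument, fall back to the running
# kernel's config under /boot (assumption: Ubuntu-style file naming).
iommu_config() {
    config_file="${1:-/boot/config-$(uname -r)}"
    grep -E 'CONFIG_INTEL_IOMMU(_DEFAULT_ON)?[= ]' "$config_file"
}
```

Running iommu_config with no argument on the host described above should print the two lines quoted earlier.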
Adding the following to the kernel command line fixes the guest freezes and kcompactd0 no longer pins its CPU core at 100%, at least thus far:
intel_iommu=on
So, for GRUB, edit /etc/default/grub to add the above string to GRUB_CMDLINE_LINUX_DEFAULT, e.g.,
GRUB_CMDLINE_LINUX_DEFAULT="intel_iommu=on"
Save and close the file, then run:
# update-grub
Reboot to take effect.
For systemd-boot, either (a) add the above string to a separate line in /etc/kernel/cmdline or (b) add the following to your boot entry .conf file in /loader/entries:
options intel_iommu=on
Save and close the file, then reboot to take effect.
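Whichever boot loader is used, it is worth confirming after the reboot that the option actually reached the kernel. A minimal sketch, with a made-up function name (has_intel_iommu_on):

```shell
# Succeed if the kernel command line contains the exact token
# intel_iommu=on. Reads /proc/cmdline unless another file is given.
has_intel_iommu_on() {
    tr ' ' '\n' < "${1:-/proc/cmdline}" | grep -qx 'intel_iommu=on'
}
```

For example: has_intel_iommu_on && echo "intel_iommu=on is active". The kernel log (dmesg | grep -i -e DMAR -e IOMMU) may also show whether the IOMMU was initialized.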
Tested with VMware Workstation 16.2.4, Ubuntu 22.04 host, Windows 10 guest on an Intel Alder Lake platform. The "intel_iommu=on" setting does NOT work.
@jiaxinslee, think you're right.
This hack, like others that have been suggested in various posts to address this phenomenon, e.g., fiddling with vm.compaction_proactiveness and vm.swappiness, turns out not to have eliminated the phenomenon entirely.
The hack did eliminate the phenomenon for a few days, and it also continues to improve performance. Nevertheless, the behavior returned, with intermittent episodes of kcompactd0 pinning its CPU core at 100%, freezing the guest, albeit not the host. The duration of the episodes is significantly less than before, however, perhaps only a minute or two instead of ten or twenty, or even hours.
One bit of memory jiu-jitsu I'm experimenting with is a host tmpfs share. Specifically, I created a tmpfs on a host mount point via fstab. The host serves files via Samba, so the mount point is within an existing Samba share. Then, on the Windows 10 guest, I created a directory symbolic link to the host's tmpfs share (mklink /d C:\tmp \\server\share\tmpfs-mount-point) and edited %TMP% and %TEMP% to refer to the symlink. An alternative configuration would map a drive letter, e.g., T:\, to the tmpfs share and could be better for the purpose. Either way, this effectively gives me a RAM mount for temp files on the Windows guest's filesystem. It definitely has improved performance and responsiveness and we'll see whether it has any effect on the kcompactd0 behavior.
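For reference, the fstab side of this can be sketched as below. The mount point, size, and mount options are placeholders, since the post does not give the real values, and the helper name tmpfs_fstab_line is made up for illustration.

```shell
# Print an fstab entry for a tmpfs of the given size at the given
# mount point. Both arguments and the rw,nosuid,nodev options are
# assumptions; the original post does not state what was used.
tmpfs_fstab_line() {
    mount_point="$1"
    size="$2"
    printf 'tmpfs %s tmpfs rw,nosuid,nodev,size=%s 0 0\n' "$mount_point" "$size"
}
```

For example, tmpfs_fstab_line /srv/samba/share/tmpfs-mount-point 2G prints a line that could be appended to /etc/fstab, after which mounting that mount point (or rebooting) brings the tmpfs up inside the share.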
A few more hacks have improved, and perhaps fixed, this, at least so far.
Doubtless, this is a memory management issue requiring both guest and host tweaks.
I got a big improvement with two guest tweaks, first by manually defining pagefile.sys size in System Properties (it was system-managed and >10GB; I reduced it to ~2GB and min=max), and second, by reconfiguring superfetch on the Windows 10 guest for boot only, not boot and applications. This prevents superfetch from loading all of one's commonly-used apps into memory in advance, substantially reducing dirty pages. So:
[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Session Manager\Memory Management\PrefetchParameters]
"EnableSuperfetch"=dword:00000002
Meanwhile, on the Linux host, and in addition to the foregoing IOMMU tweaks, I've set the following kernel parameters in sysctl.conf:
vm.compaction_proactiveness=0
vm.swappiness=1
These two parameter settings may be overkill in light of other kernel parameters that configure memory compaction and memory reclaim but (a) they work; and (b) I don't have time to figure out the other parameters.
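To confirm that the two values are live after editing sysctl.conf, a rough check along these lines may help. It assumes the "name = value" format that sysctl prints by default; the function name check_vm_settings is made up for illustration.

```shell
# Read "name = value" lines on stdin (the format sysctl prints) and
# report any of the two parameters that differs from the values above.
# Exits non-zero if a mismatch is found.
check_vm_settings() {
    awk -F ' *= *' '
        $1 == "vm.compaction_proactiveness" && $2 != 0 {
            print $1 " is " $2 ", expected 0"; bad = 1
        }
        $1 == "vm.swappiness" && $2 != 1 {
            print $1 " is " $2 ", expected 1"; bad = 1
        }
        END { exit bad }
    '
}
```

After reloading the settings with sysctl --system as root (or rebooting), run: sysctl vm.compaction_proactiveness vm.swappiness | check_vm_settings. No output means both values match.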
