Good afternoon community!
I've been testing out WS-TP-22H2 alongside Workstation 16.2.x for the past few months on some newer desktop hardware. Before I get to firing off a question or three, here's the setup.
Existing infrastructure is Workstation Pro 16.2.x on all client machines. Machines are 5-7 years old (give or take), running 4th- to 7th-generation Intel processors (i5/i7 4xxx - 7xxx). Host machines have 32GB of RAM (yes, it's a bunch, but that's beside the point), NVMe SSDs, and Windows 10 Enterprise 21H1 or newer. Two virtual machines are typically run on top of the host concurrently. The VMs each have two cores and 12GB of RAM assigned, along with adequate space on the SSD.
Performance on the existing virtual machines has for the most part been uneventful, which is all I can ask for. User tickets have been nearly nonexistent under the performance category until fairly recently, but we are due for a hardware refresh.
This segues into the purpose of this post. We've been testing out Windows 11 Enterprise on various hardware manufacturers' desktop PCs. Prior to Intel's 12th-generation Alder Lake parts, where the cores are split into P-cores and E-cores, there's not much to report beyond the status quo: everything works similarly to the older machines at a slightly higher rate of performance and throughput. However, with Alder Lake we ran into the "E-core blockade" problem other users have mentioned over the past year or so. When a virtual machine is created and powered on, if any of the E-cores are enabled and functional at the host BIOS level, the performance of the virtual machine is so atrocious it's well into boat anchor territory. Average power-on-to-Windows-11-login-screen times were well over an hour, and the Ctrl+Alt+Insert command to dismiss the lock screen took 7 minutes to execute. I won't get into the metrics for actually logging in and trying to open Microsoft Word. Watching the host Task Manager's logical CPU performance graphs showed all allocated cores pinned continuously at 100% usage, and this held regardless of how many cores were allocated.
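As a sanity check when auditing these hosts, the P/E split can be estimated from just the core and thread totals Task Manager reports, assuming every P-core has Hyper-Threading and no E-core does (which holds for Alder Lake). This is a heuristic I use, not anything from VMware:

```python
def estimate_pe_split(cores, threads):
    """Estimate (P-core, E-core) counts from total physical cores and
    logical processors, assuming HT on all P-cores and no E-cores.

    Each hyper-threaded P-core contributes one extra thread beyond its
    core count, so P = threads - cores; the remaining single-threaded
    cores must be E-cores, so E = 2*cores - threads.
    """
    p_cores = threads - cores
    e_cores = 2 * cores - threads
    return p_cores, e_cores

# An i7-12700 reports 12 cores / 20 logical processors:
print(estimate_pe_split(12, 20))  # -> (8, 4): 8 P-cores, 4 E-cores
```

If the heuristic reports zero E-cores, the BIOS-level disable took effect.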
We had success with Lenovo-brand Alder Lake machines as well as ASUS motherboards, in that their BIOS allowed the E-cores to be disabled on the host. Once we did that, the VMs booted just as fast as on an 11th-gen Intel processor, host logical CPU usage was normal (and never maxed out), and at this point in the testing everything else was good across the board. I've seen other threads detailing pinning VM cores to host processors, which isn't entirely supported by VMware. We didn't try that, as we wanted to stay within VMware's technical support field of play going forward. With that limitation, disabling the E-cores appears to have fixed this problem, but I'm hoping it won't be necessary forever. There was no difference in performance between Workstation 16.2.4 and the 22H2 technical preview with the aforementioned variables held the same. The takeaway here agrees with previous posts: if an E-core is allocated to the virtual machine, all allocated cores are treated as E-cores regardless of what they actually are, or something to that effect occurs, since we can see all allocated logical cores continuously maxed out. As a side note, enabling or disabling side channel mitigations did not affect performance up to this point.
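For anyone curious about the pinning workaround mentioned above (again, NOT supported by VMware, and we chose not to go this route): on Windows you can restrict a running process such as the vmware-vmx.exe worker to specific logical CPUs via `SetProcessAffinityMask`. A minimal sketch, assuming the hybrid CPU enumerates its hyper-threaded P-cores first (logical CPUs 0 through 2P-1, as Alder Lake does under Windows 11):

```python
import ctypes

def p_core_affinity_mask(p_cores, ht=True):
    """Bitmask covering only the P-core logical CPUs, assuming the
    P-cores' logical processors are enumerated first."""
    logical = p_cores * (2 if ht else 1)
    return (1 << logical) - 1

def pin_process(pid, mask):
    """Windows-only: restrict a running process to the masked CPUs."""
    PROCESS_SET_INFORMATION = 0x0200
    PROCESS_QUERY_INFORMATION = 0x0400
    kernel32 = ctypes.windll.kernel32
    handle = kernel32.OpenProcess(
        PROCESS_SET_INFORMATION | PROCESS_QUERY_INFORMATION, False, pid)
    try:
        kernel32.SetProcessAffinityMask(handle, mask)
    finally:
        kernel32.CloseHandle(handle)

# i7-12700: 8 HT P-cores -> logical CPUs 0-15 -> mask 0xFFFF.
# pin_process(<vmware-vmx.exe PID>, p_core_affinity_mask(8))
```

Note the affinity only lasts until the vmware-vmx.exe process exits, so it would have to be reapplied on every VM power-on, which is part of why we didn't pursue it.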
I have a couple of users stress testing 22H2 on Windows 11 Enterprise now (only P-cores enabled on the host), and there's one lingering performance issue that I'm wondering is due to side channel mitigations being enabled. Each user has reported the same problem: randomly, during program opening or execution within the VM (it could be a web browser opening, or while typing into an Excel spreadsheet, for example), the entire virtual machine will lock up for about 5 seconds, then release and carry on. Occasionally the virtual machine's UI will flood to white or black (more often white) while the freeze occurs. This seems to happen anywhere from 3-7 times a day (some outliers being more frequent). The host doesn't have much going on: CPU, RAM, and disk usage are all fairly low while this occurs. The host does not freeze when the VM does; it remains fully functional while the VM is frozen.
The new machines' specs are an i7-12700, 32 GB of RAM, and an NVMe SSD. Host and VM OS are Windows 11 Enterprise 22H2. The VM has 6 cores and 16 GB of RAM assigned. Changing the core count or RAM size does not change the freezing behavior. I'll also note that only the one VM has been running alongside the host during this testing; we have not had both VMs on together yet. The Lenovo machines full-freeze for 5-ish seconds as I've mentioned, while the ASUS machines flicker-freeze for all of half a second, so the ASUS machines seem to handle whatever this problem is somewhat better.
Has anyone else experienced the freezing I'm describing? We will be testing without side channel mitigations tomorrow, but for security reasons, I'd obviously like to keep that feature turned on.
Otherwise so far everything else seems to be fairly status quo!
Thank you for your time.
Here's an update to the previous post.
Disabling side channel mitigations did not affect the performance of the VM as far as I can tell. I also disabled Accelerate 3D graphics in case this was a visual issue. Neither change solved the hanging problem.
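For anyone wanting to script these two toggles rather than click through the VM settings UI, I believe they map to the following .vmx options (worth double-checking the exact names against what your Workstation build writes when you flip the checkboxes, as VMware has changed option names between releases):

```
ulm.disableMitigations = "TRUE"
mks.enable3d = "FALSE"
```

The first corresponds to disabling side channel mitigations, the second to unchecking Accelerate 3D graphics; both should be edited with the VM powered off.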