RussHuffman
Contributor

Windows host OS preferences: W10 Pro, Enterprise, or Server 2016

I've been running VMware for several years with a Win8 Enterprise host.  The host did nothing except run VMware, so after killing off that Metro UI, it never took much of my attention, and things worked well.  I use VMs for lots of different things, including multiple simultaneously running VMs.  I typically run several Visual Studio dev environments, another VM for VPN, another for "general web access/search/etc." (disposable), and another for office-type apps.  Often 6 or more at a time.  I'm running a pretty high-spec machine, so this was fine for my needs.  (dxdiag.txt attached)

I recently upgraded my host SSD, and decided to install Win10 Pro for the host.  No real reason, just ready to forget W8 as much as possible.  The problem is, performance of my VMs has suffered substantially.

I'm wondering if the problem is with the host.  Maybe moving to W10?  Or maybe it's Pro vs Ent?  Or if I would be better off with Server 2016. 

As stated, this machine is used for nothing except running VMware to host VMs.  Nothing else is installed on the host, including AV.  And the VMs don't run any of the problematic AVs.  I also have licenses for all MS OS variants, so that's not an issue.  All VMs are on SSDs, and their distribution has not changed. 

The biggest place I notice a difference is when running my Dev VMs.  Builds, debugging, profiling, pretty much everything is slower.  Maybe 40% by seat of the pants, and that's with the VM in question as foreground.  I also have noticeably slower transfers of large numbers of files from one to another (obviously at least 1 being background).  And of course I sometimes do start long running tasks in one VM and then switch to another while it does whatever it does.

All that said, what are the suggestions for addressing this performance issue?  Can W10 Pro be tweaked to improve things?  Or I can easily repave the host SSD and move to W10 Ent or Server 2016.

7 Replies
bluefirestorm
Champion

The choice of host operating system generally does not affect virtual CPU performance as the virtual CPU functionality is handled through Intel hardware-assisted virtualisation calls and not host OS system API calls.

The difference in performance (if noticeable at all) will likely be seen in virtual 3D graphics, which is handled through DX11 by default on Windows hosts, and in the virtual disks and virtual switches (VMnet0, VMnet1, etc.), since those need host OS calls to do their work; a likely exception is a VM using raw disks/partitions.

All three choices you laid out (Windows 10 Pro, Enterprise, or Windows Server 2016) will likely have similar under-the-hood capabilities.

As to the difference in performance: is there an antivirus such as AVG or Avast running on the host? If yes, disable the hardware virtualisation features of those products.

https://communities.vmware.com/message/2664564#2664564

https://communities.vmware.com/message/2667462#2667462

Are the virtual disks preallocated with fixed sizes, fixed size split into multiple files, or just left to grow on their own?

Your CPU is an Ivy Bridge i7-3770, which does not have the INVPCID instruction (available only on Haswell and newer). Windows 10 needs the INVPCID instruction to mitigate the performance hit of the Meltdown patch. The hit is mostly on network/disk I/O, as each context switch forces a flush of the TLB without the INVPCID instruction. Debugging might also get hit, as it will likely have to switch context between the debugger and the process being debugged.
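As an aside (not from the original reply): one quick way to confirm whether a CPU exposes INVPCID is Sysinternals Coreinfo. This is just a sketch; it assumes coreinfo.exe has been downloaded and is on the PATH.

```powershell
# Sketch, assuming Sysinternals Coreinfo is on the PATH.
# An asterisk next to INVPCID in Coreinfo's feature dump means the instruction
# is supported; on an Ivy Bridge i7-3770 it should show as not supported.
coreinfo -accepteula | Select-String -Pattern "INVPCID"
```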

https://support.microsoft.com/en-us/help/4074629/understanding-the-output-of-get-speculationcontrols...

You could try disabling the Meltdown patch (in both host and VMs) and see if it makes any difference. The Meltdown patch without the INVPCID instruction inside VMs can cause another problem: flushing the virtual TLB will likely result in a VMEXIT (handing control back to the hypervisor), and this consumes additional CPU cycles.

https://support.microsoft.com/en-us/help/4072698/windows-server-guidance-to-protect-against-the-spec...
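For reference, the Microsoft guidance linked above describes registry overrides for toggling the mitigations. A sketch using the values documented in that KB (run from an elevated prompt, reboot afterwards, and apply in both host and VMs if you want the mitigations off everywhere):

```powershell
# Sketch of the registry overrides documented in KB4072698 (elevated prompt, reboot after).
# Disable the Spectre/Meltdown mitigations:
reg add "HKLM\SYSTEM\CurrentControlSet\Control\Session Manager\Memory Management" /v FeatureSettingsOverride /t REG_DWORD /d 3 /f
reg add "HKLM\SYSTEM\CurrentControlSet\Control\Session Manager\Memory Management" /v FeatureSettingsOverrideMask /t REG_DWORD /d 3 /f
# To restore the default mitigations later, delete both values and reboot:
# reg delete "HKLM\SYSTEM\CurrentControlSet\Control\Session Manager\Memory Management" /v FeatureSettingsOverride /f
# reg delete "HKLM\SYSTEM\CurrentControlSet\Control\Session Manager\Memory Management" /v FeatureSettingsOverrideMask /f
```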

RussHuffman
Contributor

Regarding the relevance of the host OS type: I suspected as much at first, which is why I just went with W10 Pro, but I wasn't sure given my observations.

Before posting I had read all about the AV issues.  There has never been an AV on the host (other than the preinstalled Defender).  And most VMs don't have AV either.

The most performance-dependent VMs have pre-allocated disks in a single file; for instance, all my dev VMs.  Those less disk-intensive or performance-sensitive have variably sized (growable) disks.  I periodically defrag and compact/clean up, and avoid accumulating snapshots, all of which has seemed to positively affect various performance aspects in the past.  But none of it has made a real difference to the slowdown I noticed immediately after changing out the main SSD and installing W10 as the new host OS.

Yes, I still have an older i7, but that didn't change.  However, I wonder about the other point you made regarding INVPCID and Meltdown.  I don't think that would have led to a quite noticeable change, but given as many VMs as I use concurrently (at least one active on each of 3 monitors at any time), I suppose it might have been amplified?  I don't think it would contribute to the debugging specifically since I don't debug on the host, but rather just within (or across) VMs, which would be managed by the internal virtual CPUs.

I'll test out the disable, but it sure seems unlikely to account for what I've noticed.

RussHuffman
Contributor

Sorry for the delay, but it took a while before I had time to dedicate to this, and from earlier review it looked like something I was going to have to take time to figure out.  Unfortunately, it appears this may be exceeding my grasp.  Best I can tell, these mitigations are either not installed, or are being disabled by the OS detecting the missing hardware/microcode support.  If I'm wrong, I think I may need a bit more guidance in this area.

For additional reference, running the script indicated in your link produces the following results.

BTIHardwarePresent             : False

BTIWindowsSupportPresent       : True

BTIWindowsSupportEnabled       : False

BTIDisabledBySystemPolicy      : False

BTIDisabledByNoHardwareSupport : True

KVAShadowRequired              : True

KVAShadowWindowsSupportPresent : True

KVAShadowWindowsSupportEnabled : True

KVAShadowPcidEnabled           : False
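(For anyone following along: the output above comes from Microsoft's SpeculationControl PowerShell module, per the support article linked earlier. A minimal sketch of obtaining it, run from an elevated PowerShell prompt:)

```powershell
# Sketch: install and run Microsoft's SpeculationControl module.
Install-Module SpeculationControl -Scope CurrentUser
Set-ExecutionPolicy RemoteSigned -Scope CurrentUser   # may be needed to allow the module to run
Get-SpeculationControlSettings
```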

I'm actually considering returning to my previous host OS of Win 8.0 Enterprise.  I'm not entirely sure why, but my VMs (including W10) performed quite acceptably with regard to performance before I replaced my host SSD and decided to move to a W10 Pro host.  I still haven't repurposed that old SSD, so I might even just stick it in as a boot drive and see whether something else (like an update to the W10 guest OS as described?) simply coincided with the host update, leading me to incorrectly associate it with the performance degradation.  Frustrating...

bluefirestorm
Champion

The BTI section is for Spectre. It is ineffective, as no microcode was detected, indicated by BTIHardwarePresent being False.

BTIHardwarePresent             : False (means the Intel CPU microcode for Spectre mitigation is not available)

BTIWindowsSupportPresent       : True

BTIWindowsSupportEnabled       : False

BTIDisabledBySystemPolicy      : False

BTIDisabledByNoHardwareSupport : True (because the Intel microcode is not available the Spectre mitigation is disabled/ineffective in the OS)

On the other hand, the Meltdown mitigation is purely an OS update, but since the CPU is Ivy Bridge, "KVAShadowPcidEnabled" shows as False because the CPU does not have the INVPCID instruction, and thus there is a potential performance hit in specific workload scenarios.

KVAShadowRequired              : True

KVAShadowWindowsSupportPresent : True

KVAShadowWindowsSupportEnabled : True (means Meltdown patch is enabled)

KVAShadowPcidEnabled           : False (value is False because Ivy Bridge and older CPUs don't have the INVPCID instruction)

All the Microsoft articles indicate Spectre/Meltdown patch availability for Windows 8.1 but not Windows 8 (with the exception of Embedded). So it might indeed be the case that you experienced the performance hit with the Meltdown patch enabled in Windows 10, while switching back to Windows 8 takes it away.

Even though you don't debug on the host, as indicated earlier, the potential performance hit from Meltdown can be felt in both host and VM. So if the Meltdown patch is enabled in the VM as well as the host, then without the INVPCID instruction you can get multiple layers of performance hits: the context switch between the debugger and the process being debugged, which then causes a VMEXIT. On top of that, since you mention you debug across VMs, you also take the hit at the VM network I/O and at the VM transitions between the two VMs (transition out from VM1 to the host, host transition into VM2, VM2 transition back to the host, host transition back to VM1).

RussHuffman
Contributor

I've been muddling over whether to risk disabling the patch, or just give up on this trusted work horse and build a new system.  This one has more than earned its keep over the years, so maybe it's time to build again.  A new fast 6 core with 32 (maybe 64?) GB DDR4 RAM and M.2-M SSDs would surely perk things up again. 

Do you guys have hardware recommendations or warnings for someone who lives completely in VMs all day doing some pretty involved software dev work?  It's not at all unusual for me to have 2 dev VMs interactive while another runs some long processes in the background at the same time.  The worst part about building a new machine is figuring out exactly what configuration to build to last for the long haul (like this one has).

bluefirestorm
Champion

It takes more than just cores and clock speed to get good VM performance. If it were entirely down to clock speed and cores, performance would have plateaued a long time ago, when base clock rates essentially stopped going above 4GHz and core counts pretty much stayed at 4 for consumer desktop/mobile CPUs.

If you compare, for example, a Coffee Lake i7-8700 and an i7-3770, you can hardly see from the Intel ARK site the things that will make a difference. In fact, the i7-8700's base clock speed is slower.

https://ark.intel.com/compare/126686,65719

The core count and amount of RAM have a greater effect on how many VMs you can run simultaneously and comfortably, without the fans running at full speed. 16GB RAM might even be sufficient with two VMs powered up simultaneously and 8GB allocated to each. There is a point of diminishing returns with RAM (both physical and virtual).

It is hard to give advice on a specific CPU to use, and it is hard to predict how CPU technology will advance or regress. Part of the problem is that Intel does not publish specifications on certain things (maybe they do and I am not looking in the right places, or maybe it is behind an Intel registered site). One of the things Intel has improved is the number of cycles required for VM transitions, so a Coffee Lake CPU will likely require fewer CPU cycles for a VM transition than an Ivy Bridge CPU. The frequency of (and need for) such VM transitions has also been reduced. It is these types of improvements that can make VMs on a 2.5GHz Haswell CPU likely outperform the same VMs running on a 3.6GHz Westmere Xeon (just for the sake of example; I didn't run any actual benchmark).

One warning about Kaby Lake/Coffee Lake CPUs: you may need to upgrade to Workstation Pro 14.x if you are still on an earlier version of the Workstation software. If your development VMs require enabling virtualised performance counters, the VMs will not be able to power up under Workstation 12.x or earlier on Kaby Lake/Coffee Lake CPUs. I cannot confirm whether Coffee Lake CPUs are already supported in 14.1.2, but I see the Coffee Lake string is now in the vmware-vmx.exe of version 14.1.2.
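As an illustrative aside (not from the original reply), one way to check which Workstation VMX build is installed is to read the version resource of vmware-vmx.exe; the install path below is an assumption, so adjust it for your system:

```powershell
# Sketch: report the installed vmware-vmx.exe version (path is an assumption).
$vmx = "C:\Program Files (x86)\VMware\VMware Workstation\x64\vmware-vmx.exe"
(Get-Item $vmx).VersionInfo.ProductVersion
```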

RussHuffman
Contributor

Thanks for the explanation. I'm aware of the frequency plateau and alternate paths used to continue improving performance. 

About a week back I upgraded my dev machine to an i7-8700 CPU with 32GB of fast RAM.  And again, all my VMs run from recent-generation Samsung SSDs.  So unless I've missed something, I think that's about as much hardware-based VM performance as can reasonably ($$$$) be expected for a dev desktop.  I'm also running Workstation 14.1.2 build-8497320. 

At this point I'm back to tolerable performance when running my typical configuration.  That happens to be 2-3 dev VMs plus another 3-4 less well-endowed VMs (web access, email, Office apps, VPN, etc.).  Too much time has now passed since things went wrong, but it still doesn't feel as snappy as it was before I updated the host OS SSD and changed the host OS from W8-Ent to W10-Pro.  Then again, that could be my biased impression from it having been very slow for a time.
