VMware Communities
jgreenwald235
Contributor

Best/Fastest Hardware for Workstation Pro 14 for multiple VMs

I need to run multiple VMware Workstation 14 VMs at one time: generally about 7 at once, though it could reach 15 in some cases. Each VM will generally have one CPU assigned to it.

The VMs are used to run Oracle database servers and WebLogic Server instances for demo purposes when I train students. I only run Oracle Linux in all my VMs. Ideally I'd like to be able to start up JVM-based WebLogic server instances in the VMs quickly during class demos.

My current setup is:

i7-6700K PC (4 cores) with 64 GB RAM, SSDs, and a PCI-based M.2 SSD (not NVMe yet) to run the VMs.

Dell R710 server - dual 6-core Xeon X5660 2.8 GHz CPUs with 108 GB RAM, SSDs for running the VMs, and 4.2 TB of 10k disk for VM storage and backup.

So I have 16 total cores (32 threads) currently.

The two machines are connected via a 10GbE network (point to point) so I can copy VMs between them quickly, as some VMs can reach 60 GB in file size.

The point of this query is that I find the Dell much slower than the 6700K at starting up the VMs and the JVMs inside them (though possibly I'm simply impatient), and its VMs are not as "responsive" when using the Linux GUI. While that is not surprising, the differences are pretty severe. Starting a WebLogic JVM can take 3-4 minutes or more on the Dell, but under 2 minutes on the 6700K. I have found that adding more CPUs to the Dell VMs helps some, and I do have cores to burn on the Dell. I have not yet switched on Turbo Mode, which takes the Xeons to 3.2 GHz on the Dell. I plan to test that shortly.

I am thinking about upgrading the machines and maybe replacing the Dell. I am considering:

Replace the i7-6700K with an i7-8700K = 6 cores

Replace the Dell with an i7-7820X = 8 cores

For a total of 14 cores.

AMD is not an option, as there are VMware Workstation compatibility issues with Linux guests, and I also use VirtualBox, which clearly states "Intel HW Only".

Though my total core count would drop by 2, the new cores would be much faster than the Dell's. And I would not need to keep the 7820X in a closet, which is where the Dell now lives due to fan noise.

I'm trying to decide if it's worth replacing the machines. Doing the PC builds myself and hunting for deals on parts, I'm looking at almost $3,000. I would reuse some parts (PSU, RAM, SSDs) and part out the Dell and 6700K on eBay to offset the costs, but I'm still looking at probably $3,000. That's not out of the budget if the return is worth it. I'm not sure it is, and that's why I'm here asking for your help.

One option is to keep the Dell and replace the Xeon X5660s with used Xeon X5690s for about $230. Turbo Mode on the X5690 is 3.73 GHz, so almost a full 1 GHz faster than the X5660 in normal mode (2.8 GHz) and about 1/2 GHz faster than the X5660 in Turbo (3.2 GHz). But it is a big jump in watts (95 W to 130 W per CPU).

Right now I'm leaning towards keeping the 6700K and building the i7-7820X PC (about $1600) to get to 12 fast cores, which will handle 80-90% of my demos. Both can live in my office near me, and I would keep the Dell for the few demos that need 15+ cores.

Your thoughts are appreciated.

Accepted Solutions
bluefirestorm
Champion

I would say keep the i7-6700K for now; or at the very least don't change to the Coffee Lake i7-8700, as you would likely encounter the same problem as in this thread:

Kernel Panic at boot time, using Oracle Linux 6.7

I don't think VMware Workstation 14.x supports Coffee Lake CPUs yet so certain virtualisation features may not work fully.

Speed of VM execution isn't just about cores and clock speed. It is also about minimising VMEXIT situations. A VMEXIT occurs when a VM has to hand control back to the hypervisor, which costs CPU cycles. Then there is also the inverse, which is VMENTRY.

I can't find any official Intel document publishing the estimated CPU cycles required for VMENTRY/VMEXIT per CPU generation/family. But from what I have read, the CPU cycles required for these have been reduced significantly over the different generations of CPU. So a Westmere Xeon CPU will likely incur a couple of thousand CPU cycles, while a Skylake desktop CPU might incur less than a thousand for each VMEXIT.

Furthermore, there are features that prevent these VMEXITs in the first place.

I would think you should replace the Dell R710 first; if possible with a Haswell or later Xeon, although the Skylake i7-7820X might be OK as well. What I find from looking at vmware.log files in this forum is that not all CPUs are created equal; not even in the same generation/family. One feature called "virtual interrupt delivery" seems to be available only on Ivy Bridge and later Xeons, and I don't see it in Ivy Bridge and later desktop/mobile chips.

Virtual interrupt delivery is another feature that prevents a VMEXIT situation.

I haven't seen any vmware.log from a Skylake i7-7820X yet, so it could suffer the same problem as the Coffee Lake CPU if its family/model/stepping is significantly different from regular Skylake desktop/mobile/Xeon chips.

The reason for the Haswell or later Xeon recommendation is that Haswell introduced the INVPCID instruction, which mitigates the performance hit of the Meltdown patch. Without INVPCID available inside the VMs, VMEXITs will also occur. From what I understand, the Linux kernel patches also rely on the INVPCID instruction in a similar way to the Windows Meltdown patch.

Having said all that, I would think it is the absence of the INVPCID instruction, more VMEXITs, and more CPU cycles required to handle VMENTRY/VMEXIT that make the Dell R710 slower than the i7-6700K. So don't bother to upgrade the Dell R710 Xeon, as you are simply replacing a Westmere Xeon with another Westmere Xeon.
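
One way to see whether INVPCID is actually visible inside a Linux guest is to look at the CPU flags the guest kernel reports. The following is only a rough sketch: it assumes the guest exposes the usual `flags` line in /proc/cpuinfo and that the kernel names the feature `invpcid` there. The helper names `cpu_flags` and `report` are made up for illustration.

```python
def cpu_flags(cpuinfo_text):
    """Return the set of CPU feature flags from /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            # The first "flags" line is enough; all logical CPUs normally match.
            return set(value.split())
    return set()


def report(cpuinfo_text, wanted=("invpcid",)):
    """Map each wanted flag name to True/False depending on whether it is present."""
    flags = cpu_flags(cpuinfo_text)
    return {name: name in flags for name in wanted}


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            print(report(f.read()))
    except OSError:
        print("No /proc/cpuinfo here; run this inside a Linux guest.")
```

Running it in a VM on the R710 and again in a VM on the i7-6700K would show whether the two hosts really differ on this point.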

jgreenwald235
Contributor

Thank you for your thoughtful reply.

I am considering a single Intel Xeon E5-2690 v2 (ten cores, 3.0 GHz with a turbo of 3.6 GHz), which seems to perform well and is within budget when purchased as part of a refurbished Dell R720. And I can use the existing drives and memory from the Dell R710 to help keep costs down.

bluefirestorm
Champion

The Intel ARK comparison can only show the differences in specifications:

https://ark.intel.com/compare/75279,47921

Currently you have a total of 32 (8 + 2 x 12) logical CPUs from the i7-6700K (4C/8T) and the dual X5660s (6C/12T each).

Changing from the dual X5660 R710 to a dual E5-2690 v2 (10C/20T each) R720, you will get a total of 48 logical CPUs (8 + 2 x 20).

If you go with a single E5-2690 v2, you get a total of 28 logical CPUs (8 + 20), only 4 fewer than the current configuration.

If most of the 15 VMs are single-CPU and idle most of the time, and the grunt work is really done on the Oracle database server VM and the WebLogic Server VM, you might be able to get away with just a single E5-2690 v2.
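
The logical CPU totals above can be re-checked with a few lines of Python. This is just a sketch of the arithmetic using the thread counts quoted in this reply; the constant and function names are made up for illustration.

```python
# Logical CPUs (threads) per socket, as quoted in this thread.
I7_6700K = 8       # 4C/8T
X5660 = 12         # 6C/12T
E5_2690_V2 = 20    # 10C/20T


def total_logical_cpus(*hosts):
    """Sum logical CPUs over hosts given as (threads_per_socket, sockets) pairs."""
    return sum(threads * sockets for threads, sockets in hosts)


current = total_logical_cpus((I7_6700K, 1), (X5660, 2))
dual_r720 = total_logical_cpus((I7_6700K, 1), (E5_2690_V2, 2))
single_r720 = total_logical_cpus((I7_6700K, 1), (E5_2690_V2, 1))

print(current, dual_r720, single_r720)  # 32 48 28
print(current - single_r720)            # 4
```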

Aside from the fewer CPU cycles required for VMENTRY/VMEXIT and the virtual-interrupt delivery that come with a change to an Ivy Bridge Xeon, you should be able to have 1GB page support with the Ivy Bridge Xeon. I don't think Westmere CPUs support 1GB huge pages.

You can check the vmware.log of any VM on the current R710 to see whether it indicates "yes" or "no":

vmx| I125:   1GB super-page                    no
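
If there are several VMs to check, the lookup can be scripted. The sketch below assumes the capability lines always look like the excerpt above (a `vmx|` tag, a log level such as `I125:`, the capability name, then `yes` or `no`); that pattern is inferred from the lines quoted in this thread only, and `find_capability` is a hypothetical helper name.

```python
import re

# Inferred from excerpts such as:
#   vmx| I125:   1GB super-page                    no
CAPABILITY_RE = re.compile(
    r"vmx\|\s*I\d+:\s+(?P<name>.+?)\s{2,}(?P<value>yes|no)\s*$"
)


def find_capability(log_text, name):
    """Return True/False for the named capability, or None if it is not logged."""
    for line in log_text.splitlines():
        match = CAPABILITY_RE.search(line)
        if match and match.group("name").strip() == name:
            return match.group("value") == "yes"
    return None


sample = "vmx| I125:   1GB super-page                    no"
print(find_capability(sample, "1GB super-page"))  # False
```

For a real check, the text of a VM's vmware.log would be read from disk and passed in as `log_text`.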

The 1GB pages will likely benefit the Oracle database server and WebLogic VMs, but you might have to specifically configure the guest OS for them. Huge pages require fewer entries in the TLBs and make entries less likely to be flushed out (thus preventing VMEXITs).

jgreenwald235
Contributor

Thank you, I will consider the 2690 v2.

I am still interested in the 7820X for a desktop PC vs a server for the 2690.

The ARK comparison shows that the 7820X, while it has fewer cores and costs more, is much faster. Build-out for both is about the same cost, and the 2690 does open the door to more RAM. And it can use a Noctua D15, so noise should not be an issue.

And what are your thoughts on cores vs threads?
Since my VMs are mostly inactive except for brief periods when a request is made to a server or I start a JVM, I'm not making heavy demands on the VMs. One exception would benefit from multiple cores, but I can scale that back if needed.

So I'm still leaning towards the 7820X.

bluefirestorm
Champion

Being a Skylake CPU, the i7-7820X will have the INVPCID instruction, which mitigates the performance hit on disk/network I/O-intensive tasks that comes with the Meltdown patch. That is one thing that goes against the Ivy Bridge E5-2690 Xeon, as it does not have the INVPCID instruction.

The i7-7820X was released in Q2 2017. A lot of Kaby Lake CPUs were released in Q1 2017. If you don't need DTrace and/or virtualised performance counters inside the VMs, you can take the chance of switching to the i7-7820X instead of an Ivy Bridge Xeon. That way, in the event Workstation 14.x does not properly recognise it, you won't run into significant problems; the likely issue would be something like the VPMC problems that Kaby Lake/Coffee Lake CPUs have with Workstation Pro 12.5.x.

As for cores vs threads, I can't find any Intel documentation that explicitly says so, but I would think a hyperthread cannot break off to serve another VM if its partner thread in the same core is already executing for a particular VM. From the Nehalem microarchitecture to the Skylake microarchitecture, this seems to remain the same (each core, and therefore both threads where hyperthreading is available, shares the same execution pipeline and L1 cache).

jgreenwald235
Contributor

Thank you, that is my understanding about threads vs cores as well. I could see adding 1 core and 2 threads to a VM if needed, and that might work, but I don't think I can split threads across VMs.

Your answer has me still leaning towards the 7820X. One reason is that it puts me into a motherboard that can support the i7-79xx processors, so I could upgrade to more cores later if needed.

jgreenwald235
Contributor

In case you're interested:

I tested two types of VMs on my i7-6700K and my older Mac Pro, which has dual X5690s. The Mac Pro is faster and quieter than the Dell and is usually reserved for video editing.

One type of VM had a large WebLogic installation with complicated applications installed. The other was a fresh, simple install.

The differences in performance show the Xeons lagging by 50 - 70 percent in some cases. They are usable once up and running, but take a lot longer to get the WebLogic-based JVMs started.

Using CPU Benchmark single-core numbers, the 7820X comes out ahead of the 6700K (2480 vs 2351, maybe not so significant) and way ahead of the Xeon X5690 and X5660 (1508 and 1272).

So replacing the Dell with the 7820X may be the best option: give up the slower processors. And while I lose 4 cores, I gain it back in speed. The Mac Pro picks up the slack running the "back end" VMs, those least visible with respect to speed (software load balancer, DB, LDAP, file server, web servers); once up, they stay up with little use. The visible VMs that see more startup and shutdown would live on the 6700K, the 7820X and my MacBook Pro (i7-4960, CPU benchmark just under the 6700K).
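
To put those single-core scores on one scale, a quick sketch that expresses each CPU as a percentage of the i7-6700K. It uses only the four numbers quoted above; the function name `relative_to` is made up for illustration.

```python
# Single-core benchmark scores quoted in the post above.
scores = {"i7-7820X": 2480, "i7-6700K": 2351, "X5690": 1508, "X5660": 1272}


def relative_to(baseline, scores):
    """Express every score as a percentage of the baseline CPU's score."""
    base = scores[baseline]
    return {cpu: round(100 * s / base, 1) for cpu, s in scores.items()}


for cpu, pct in relative_to("i7-6700K", scores).items():
    print(f"{cpu:10s} {pct:6.1f}% of i7-6700K")
```

On these numbers the 7820X lands at about 105.5% of the 6700K, the X5690 at about 64.1% and the X5660 at about 54.1%.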

bluefirestorm
Champion

It would be reasonable to expect that the Ivy Bridge Xeon would perform somewhere in between the i7-6700K and the Westmere Xeon, but my guess is that the needle would likely be nearer to the Westmere Xeon than to the i7-6700K. So I agree the i7-7820X is a good choice despite losing core count from the previous Dell R710 configuration.

The i7-4960HQ in the MacBook Pro is a Haswell CPU, so it will have the Haswell virtualisation improvements that lower the frequency of/need for VMEXITs, aside from the INVPCID instruction.

I got to look at a vmware.log from a Westmere Xeon CPU yesterday, and it looks like it should support 1GB pages after all.

| vmx| I125:   1GB super-page                   yes

| vmx| I125: Capability Found: cpuid.PDPE1GB = 0x1

Perhaps it is worth trying to configure HUGEPAGES for the Linux VMs on the Mac Pro? In theory, HUGEPAGES should make it less likely for the TLB to be cleared, which in turn prevents a VMEXIT. I must say I haven't tried configuring HUGEPAGES for any Linux (VM or physical) and I consider myself a beginner with Linux. So it is entirely optional; no pressure for you to try it, as it may be something that requires a non-trivial effort to test/compare.
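
Before and after any such experiment, the guest's huge page state can be read back from /proc/meminfo. The sketch below only reports what the kernel shows and does not configure anything; it assumes the standard `HugePages_*` and `Hugepagesize` entries are present in the guest, and `hugepage_info` is a hypothetical helper name.

```python
def hugepage_info(meminfo_text):
    """Collect HugePages_* and Hugepagesize entries from /proc/meminfo-style text."""
    info = {}
    for line in meminfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.startswith("HugePages_") or key == "Hugepagesize":
            # Values look like "512" or "2048 kB"; keep the number only.
            info[key] = int(value.split()[0])
    return info


if __name__ == "__main__":
    try:
        with open("/proc/meminfo") as f:
            print(hugepage_info(f.read()))
    except OSError:
        print("No /proc/meminfo here; run this inside a Linux guest.")
```

A `HugePages_Total` of 0 would mean no huge pages have been reserved in that guest yet.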