7 Replies Latest reply on Dec 9, 2019 1:53 AM by roblevine

    SSD Performance Degradation in VMWare Workstation Guest

    codecraft Novice

      Problem:

       

      2x PCIe NVMe (Samsung SM960) SSDs in RAID 0 benchmark significantly slower in the guest OS than in the host OS.

      Host and guest are Windows 10, updates current as of today.

      VMware Workstation 14, updates current as of today.

      Disks are split/not preallocated.

      Results in attached PDF.

      Disks are not independent, there are no snapshots, and I see no significant difference between independent and non-independent mode.

       

      I expect some performance trade-off in the virtual machine, probably on the order of 10% or so.

       

      10% is an assumption but I don't expect the guest to be orders of magnitude worse than the host.

       

      On some of the tests the guest throughput is a third or less of the host's.

       

      I have tested this for various disk types; none get anywhere near the reported host performance.

       

      Is there anything that can be done from a topology, system-settings, or driver perspective to improve the guest throughput?

       

      ( Updated to correct a result in the attached PDF. )

        • 1. Re: SSD Performance Degradation in VMWare Workstation Guest
          roblevine Novice

          This sounds very similar to my issue here: Very poor SSD performance inside Windows 10 guest running on VMWare Workstation on a Windows 10 host

           

          I got an initial "bite" from a VMware employee who asked a question, but I have had nothing since. It makes guests pretty unusable for my developer workflows.

           

          I've got to the point where I'm going to stop using VMWare Workstation for development as the performance is so poor. Pity - I've been developing this way since version 5, but nothing I've tried has improved the performance.

          • 2. Re: SSD Performance Degradation in VMWare Workstation Guest
            louyo Master

            OK, my $0.02 (I am just a user who dwells in the slow group):
            (I firmly believe that most, if not all, benchmark programs are written with an axe to grind)
            I see very good performance on my W10 VMs running on my Linux host. At first boot it is sluggish for a few minutes as W10 phones home and gossips for a while. After that, I see very little difference from performance on clients' W10 systems. Sure, if I measure it, it is bound to be different. I do not run W10 natively; I run via ESXi or WS 14 on Linux. No religious war intended.

            If performance were so bad that WS was unusable, folks would be in here piling on like crazy. So, in my alleged mind, something is amiss. Oranges and apples, but here are some interesting results from my system, running Mint 19 with a Mint 19 VM: no benchmark program, just writing and then reading back 1 GB of zeros, clearing caches in between. The host writes at 1.0 GB/s and the VM writes at 0.971 GB/s. Oddly enough, the VM reads at 2x the speed of the host (first pass).

            Host:
            lou@T5810:~$ sudo sh -c "sync && echo 3 > /proc/sys/vm/drop_caches"
            lou@T5810:~$ dd if=/dev/zero of=./largefile bs=1M count=1024
            1024+0 records in
            1024+0 records out
            1073741824 bytes (1.1 GB, 1.0 GiB) copied, 1.05241 s, 1.0 GB/s
            lou@T5810:~$ sudo sh -c "sync && echo 3 > /proc/sys/vm/drop_caches"
            lou@T5810:~$ dd if=./largefile of=/dev/null bs=4k
            262144+0 records in
            262144+0 records out
            1073741824 bytes (1.1 GB, 1.0 GiB) copied, 1.94095 s, 553 MB/s

            VM:

            lou@Mint19VM:~$ sudo sh -c "sync && echo 3 > /proc/sys/vm/drop_caches"
            lou@Mint19VM:~$ dd if=/dev/zero of=./largefile bs=1M count=1024
            1024+0 records in
            1024+0 records out
            1073741824 bytes (1.1 GB, 1.0 GiB) copied, 1.10613 s, 971 MB/s
            lou@Mint19VM:~$ sudo sh -c "sync && echo 3 > /proc/sys/vm/drop_caches"
            lou@Mint19VM:~$ dd if=./largefile of=/dev/null bs=4k
            262144+0 records in
            262144+0 records out
            1073741824 bytes (1.1 GB, 1.0 GiB) copied, 0.953116 s, 1.1 GB/s

            write is 1.0 vs 0.971 GB/s, some overhead
            read is 0.553 vs 1.1 GB/s

            HUH?
            VM is 2x read speed of host?

            so, clear the cache on the host before read on VM
            then:

            lou@Mint19VM:~$ sudo sh -c "sync && echo 3 > /proc/sys/vm/drop_caches"
            lou@Mint19VM:~$ dd if=./largefile of=/dev/null bs=1M
            1024+0 records in
            1024+0 records out
            1073741824 bytes (1.1 GB, 1.0 GiB) copied, 1.9238 s, 558 MB/s

            So caching makes a difference on both the VM and the host, and caching depends on available memory.
            Here, I only have 2 VM's running and top shows:
            KiB Mem : 32568688 total, 22393852 free,  3138420 used,  7036416 buff/cache
            KiB Swap: 14884856 total, 14884856 free,        0 used. 24381940 avail Mem

            plenty of memory for buffering/caching. Remember, drives on the VM are just files on the host. How much free memory do you have? Are you sure there is no swapping? Are your VMs on separate drives?
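            A way to take the page cache out of the write comparison without needing root for drop_caches is GNU dd's conv=fdatasync, which makes dd flush the data to disk before it reports its timing. A minimal sketch (the file path is arbitrary):

```shell
# Write 256 MiB of zeros; conv=fdatasync forces the data to disk
# before dd exits, so the reported rate reflects the actual write,
# not just a copy into the page cache. (File path is arbitrary.)
dd if=/dev/zero of=/tmp/ddtest bs=1M count=256 conv=fdatasync

# For cache-cold reads without root (no drop_caches), iflag=direct
# bypasses the page cache on filesystems that support O_DIRECT:
# dd if=/tmp/ddtest of=/dev/null bs=1M iflag=direct
```

            Run the same commands on the host and in the guest against the same backing storage and the numbers become directly comparable.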

            How about AV programs? Do you have them configured to leave the Workstation files alone? If they examine them each time they are read/written, that would be significant overhead.

            -- The universe is composed of electrons, neutrons, protons and......morons. ¯\_(ツ)_/¯
            • 3. Re: SSD Performance Degradation in VMWare Workstation Guest
              codecraft Novice

              My experience is that benchmarking tools are there to help users measure system (and relative) performance.  If there is some evidence that CrystalDiskMark has a nefarious agenda, by all means offer it and we can see if there is an alternative tool to use.  The reason I used the tool is that I've seen it widely used in other benchmark scenarios, so I believe it to be trusted for measuring disk performance.

               

              The Linux comparison noted above, in which the VM initially appears faster than the host, isn't really helpful in the context of users asking why their host has ~4x the throughput of the VM.  What it demonstrates is that the performance is roughly equivalent.  That would be a nice problem to have.

               

              W10 host and W10 guest are running off the same RAID 0 PCIe NVMe drives.  Host W10: 64 GB RAM, no swap.  Guest W10: 32 GB, no swap.  No other notable system load (e.g. host and guest mainly idle).

               

              The 4 guest test disks have write caching disabled.  However, I see no notable difference with the primary guest OS disk, which has write caching enabled.  The host has write caching enabled.

               

              If we want to discount write caching, we can focus purely on read throughput.  Read caching on host and guest is whatever W10 does out of the box.  Nothing here seems to justify 1/3 or worse throughput in the guest versus the host.
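              For a read comparison that sidesteps both W10's caching and any benchmark-tool skepticism, Microsoft's diskspd can issue unbuffered random reads identically on host and guest. A sketch of an invocation (flags per diskspd's documentation; the test file name is arbitrary):

```shell
:: 30-second 4 KiB random read test, 8 threads, 32 outstanding I/Os.
:: -Sh disables software caching and hardware write-cache hints;
:: -c1G creates a 1 GiB test file if it does not exist.
diskspd.exe -b4K -o32 -t8 -r -d30 -Sh -c1G testfile.dat
```

              Running the same line on host and guest against the same array would show whether the gap persists once the OS cache is out of the picture.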

               

              VM CPU is not being maxed out during the tests, so I doubt there is a CPU issue in play.  No swap in play.  Both host and guest have vast amounts of free RAM, so RAM should not be in play.

               

              It seems to be more an issue of how VMware is passing/translating I/O between the guest and host, or a bottleneck in throughput between host and guest.

              • 4. Re: SSD Performance Degradation in VMWare Workstation Guest
                codecraft Novice

                If the issue here is one of translation between guest and host I/O requests, additional questions:

                 • What does a RAID 0 PCIe NVMe (2x) array with the Intel Matrix Storage driver look like to VMware?
                 • Should VMware be able to pass requests for an NVMe guest disk through to the host without translation?
                 • Could this be related to the maturity of the VMware NVMe driver(s)?
                • 6. Re: SSD Performance Degradation in VMWare Workstation Guest
                  TimothyHuckabay Novice

                  I posted this important information in January 2019, and apparently some unhappy individual (presumably with VMware) foolishly removed it, so here it is again.  It is just as relevant today under version 15.5.x as it was when I posted it back in January 2019:

                   

                  As of version 15.0.2 of Workstation, this is still NOT fixed.  VMware's claim of "performance improvements for virtual NVMe storage" is just hot air.  I have extensively tested the latest versions of Workstation Pro 15 and Workstation Pro 14, and can definitively state that the SCSI drive option continues to be faster than the NVMe drive option, and both are much slower than they should be within VMs.  (Even with the SCSI option, read/write performance is DRAMATICALLY worse in the VM than on the host for NVMe drives.)  I tested using a Samsung 970 Evo 2TB NVMe drive in all cases.  There is not even a recognizable marginal improvement in NVMe performance between Workstation 14 and 15.  So, VMware has yet to actually address this problem.  Notably, Workstation 15 continues to recommend SCSI drives over all others for performance, so at least that much is still true.  NOTE: Within a VM, NVMe drives particularly suffer on multithreaded reads and writes, as may be seen in the AS SSD Benchmark tool's 4K-64Thrd (i.e., 64-thread) read and write tests.
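                  The multithreaded 4K case above can also be reproduced with fio, which runs on both host and guest. A sketch of a job file approximating the 4K-64Thrd read test (job and file names are mine, not from AS SSD; on Windows you may need to add ioengine=windowsaio):

```ini
; fio job approximating AS SSD's 4K-64Thrd read test (sketch)
[global]
filename=fio-test.dat   ; scratch file, created on first run
size=1g
bs=4k                   ; 4 KiB blocks
direct=1                ; bypass the OS page cache
runtime=30
time_based

[rand-read-64thrd]
rw=randread
numjobs=64              ; 64 concurrent jobs, as in 4K-64Thrd
group_reporting
```

                  Comparing the aggregate IOPS from the same job file on host and guest isolates the multithreaded penalty from any single tool's scoring.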

                   

                  There are very few reasons to upgrade to 15 Pro from 14 Pro IMO.  Even 4K support is questionable: one can achieve workable 1080P across 1080P and 4K screens with Windows 10 (1809) for example.  (Just set the host's resolution to 1080P before launching Workstation 14 Pro, AND ensure that you have set the "high DPI settings" appropriately: using the Properties for the Workstation Pro shortcut, go to the Compatibility tab, then "Change high DPI settings," and select "Program DPI" with "I signed in to Windows" for the corresponding dropdown option.)  This said, the 4K support is nice if you do decide to fork out the bucks for 15 Pro.  If you are using 15 Pro, you may also want to enable (check) "Automatically adjust user interface size in the virtual machine" (under VM --> Settings --> Hardware --> Display).

                   

                  Now, regarding overall VM performance, including drive performance, here are your best options to date: use a SCSI disk (NOT NVMe), enable the "Disable memory page trimming" option under VM --> Settings --> Options --> Advanced, and select "Fit all virtual machine memory into reserved host RAM" under Edit --> Preferences --> Memory.  (For this last option, you will want to ensure you have plenty of physical host RAM that you can spare, i.e., dedicate exclusively, to the VM should the VM need it.)  Lastly, if you are using 15 Pro, you may want to set Graphics memory (under VM --> Settings --> Hardware --> Display) to 3GB.
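                  For reference, the UI options above correspond to settings you can see in the configuration files themselves. The key names below are the commonly cited ones from community posts; treat them as a sketch and verify by toggling the UI checkboxes and diffing your own files:

```ini
; Per-VM .vmx (sketch; verify against your own .vmx)
MemTrimRate = "0"               ; "Disable memory page trimming"
mainMem.useNamedFile = "FALSE"  ; keep guest RAM backing out of a mapped file

; Host-wide preferences file (written by Edit --> Preferences --> Memory):
; "Fit all virtual machine memory into reserved host RAM"
prefvmx.minVmMemPct = "100"
```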

                   

                  Finally, if you are working on a system that has hybrid graphics (e.g., a laptop with an NVIDIA video card alongside the built-in Intel display graphics), you may want to use the relevant "control panel" application to instruct your system to use the discrete graphics card for vmware.exe and vmware-vmx.exe.

                  • 7. Re: SSD Performance Degradation in VMWare Workstation Guest
                    roblevine Novice

                    Agree 100%.

                     

                    SCSI performs better than NVMe, but even SCSI is an order of magnitude slower than the host; it never used to perform like this. Something happened around Workstation 14 or so that brutally hammered disk I/O.

                     

                    I'm not hearing anything from VMware on this anymore - I suspect they aren't addressing it.

                    After 10-15 years of doing all my software development in VMware Workstation, I've finally given up and stopped using it. I'm working directly on the host these days. A real pity - there is so much to commend doing my sort of work in a VM, but using this product just isn't viable for me anymore.