According to the esxtop screenshot, it's not looking bad. You have to check whether you can expect better performance from the HDDs installed. Read the specs of the disks you are using.
The VMkernel is not slowing the write process down (KAVG/cmd).
DAVG/cmd is the latency seen at the device driver level. It includes the round-trip time between the HBA and the storage.
DAVG is a good indicator of performance of the backend storage. If IO latencies are suspected to be causing performance problems, DAVG should be examined. Compare IO latencies with corresponding data from the storage array. If they are close, check the array for misconfiguration or faults. If not, compare DAVG with corresponding data from points in between the array and the ESX Server, e.g., FC switches. If this intermediate data also matches DAVG values, it is likely that the storage is under-configured for the application. Adding disk spindles or changing the RAID level may help in such cases.
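If you want numbers rather than a live screen to compare against the array, esxtop can log to CSV in batch mode. A minimal sketch follows; the capture command is standard, but the parsed CSV row below is made-up sample data (adapter name and column order are assumptions for illustration, not real esxtop output):

```shell
# Capture a baseline on the ESXi host: 10 samples, 5 seconds apart, batch mode.
# The CSV can be replayed in Windows Perfmon or parsed with awk:
# esxtop -b -d 5 -n 10 > baseline.csv

# Illustrative parse of per-adapter latencies from a sample CSV row:
printf 'vmhba0,1.23,0.05,1.28\n' |
  awk -F, '{ printf "%s DAVG=%s KAVG=%s GAVG=%s\n", $1, $2, $3, $4 }'
```

Logging the same counters before and after each change gives you a comparable baseline instead of eyeballing the interactive screen.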
The SATA HDs used in the VM Host and the physical PC are the same Seagate SATA model.
So I guess I should be looking at the BIOS settings of the VM Host and/or the settings for the Windows 10 VM?
Are there any more tweaks or statistics I can look at from the ESXi toolset that would help pin-point the issue, now that we've determined it's not the virtual drive driver?
Focus on the hardware level. Find the latency, speed, and interface specification of the disks you have.
As an example, here you can read the average latency you can expect, and try to get closer to that number by tweaking the configuration parameters.
Add more disks and a RAID controller to create a RAID group. If you have 2-3 hosts, consider adding SSDs and implementing a vSAN solution.
Treat the screenshot as a baseline: after making any change to the configuration, compare against the baseline to know whether you are improving or not. Make one change at a time and compare with the previous results. ESXTOP is a sufficient tool for observing the results.
If you are stuck with the current hardware, you can proceed with VM virtual adapter tweaking. Use PVSCSI, which is the best choice in most configurations.
There was (and maybe still exists) a problem with the SATA AHCI driver in ESXi 6.5. So if that problem is not resolved in ESXi 6.7, it might be the cause of the poor disk performance.
Have a look at this thread post Re: Very slow speed on SSD
Appreciate the input bluefirestorm.
If you are referring to the native AHCI driver issue that caused storage performance problems in ESXi 6.5 (Anthony Spiteri), it was resolved in the ESXi 6.5 U1 release.
I don't think I want to start introducing more disks and RAID into the equation as it really is not my goal at this point.
I just really want to understand why a physical PC with lower-performance hardware can outperform (albeit a slow file-append operation may not qualify as a comprehensive performance test) a VM Host with higher-performance hardware running one VM, configured as closely as I know how to the physical PC:
- Windows 10 Pro v1703 (physical) versus Windows 10 Pro v1709 (VM)
- storage controller: SATA IDE (physical) versus SATA AHCI (SCSI controller in VM)
- Same physical Seagate SATA drive - Barracuda 7200.12, ST31000528AS
If you say that ESXi is showing that it is not introducing much latency, then what is causing the performance difference?
Did I miss something in the BIOS setting of the VM Host? Or did I incorrectly install/configure the VM Host or VM, or have I not customized the Windows 10 on the VM for better performance?
Will look into VM virtual adapter tweaking as you suggested - do you have a specific link in mind?
Will look into using Paravirtual SCSI adapters - VMware Knowledge Base nicely documents how to configure an existing Windows boot disk to use PVSCSI adapter.
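For reference, the KB approach boils down to adding a second, temporary disk on a new PVSCSI controller so Windows installs the driver, then switching the boot disk's controller over. The controller entries in the VM's .vmx look roughly like this (a sketch; the key names are the standard ones, but the controller number and file name here are examples to verify against your own VM):

```
scsi1.present = "TRUE"
scsi1.virtualDev = "pvscsi"
scsi1:0.present = "TRUE"
scsi1:0.fileName = "temp-disk.vmdk"
```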
Windows 10 Pro v1703 (physical) versus Windows 10 Pro v1709 (VM)
Do both the physical and virtual machines have the Meltdown patch?
The Meltdown patch can have a bad performance side effect for both physical and virtual machines running I/O-intensive operations (which I think a file append would be). Mitigating this performance side effect requires a CPU of Haswell generation or newer. You don't indicate which CPUs you have for the physical Windows 10 machine and the ESXi host. ESXi requires virtual machine hardware version 11 or higher so that the Haswell instructions are exposed to the VM; ESXi 6.7 uses version 14 by default, though.
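If you want to check that angle, the cheaper Meltdown (KPTI) mitigation path relies on the PCID/INVPCID CPU features, which Haswell and later expose to guests. A quick way to look for them from a Linux guest is to grep /proc/cpuinfo; shown here against a made-up sample flags line so the command itself is illustrated (the real check is the commented line):

```shell
# On a live Linux system you would run:
# grep -o -w -e pcid -e invpcid /proc/cpuinfo | sort -u
# Demo against a sample flags line (made-up, for illustration):
printf 'flags : fpu vme pcid invpcid avx2\n' | grep -o -w -e pcid -e invpcid
```

If only `pcid` (or neither) shows up in the guest, the mitigation falls back to the slower path, which would hit an I/O-heavy append hard.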
Is the VM virtual disk thick provisioned?
Also try bumping the number of vCPUs in the VM from 1 to 2.
Server hardware is often designed to last as long as possible, to guarantee stable operation, and to offer tuning possibilities. The expensive models give higher performance ratios, too.
To further investigate performance issues and optimize vDisk performance, check whether the AHCI controller firmware matches the driver version in ESXi 6.7; the hardware's firmware or the host's driver may need to be updated.
To optimize write operations in the VM, create eager-zeroed thick vDisks.
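As a sketch of that, from the ESXi shell you can either create the disk eager-zeroed up front or inflate an existing thin disk in place with vmkfstools. The datastore path and size below are examples; the commands are echoed as a dry run here since vmkfstools only exists on an ESXi host:

```shell
# Example path and size; run the echoed commands on the ESXi host itself.
DISK=/vmfs/volumes/datastore1/win10/win10-data.vmdk
echo vmkfstools -c 40G -d eagerzeroedthick "$DISK"   # new eager-zeroed thick disk
echo vmkfstools --inflatedisk "$DISK"                # thin -> eager-zeroed, in place
```

Inflating writes zeros to every block, so expect it to take a while on a spindle drive.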
I did a bit more reading of the use of PVSCSI, and VMWare KB article 1010398 states that "PVSCSI adapters are not suitable for DAS environments". So I'm not going to follow up on this for now.
You guys hit it on the head!
The VM was using a thin-provisioned disk! In hindsight, it makes sense - as the data is being written, the overhead must come from allocating and zeroing disk blocks on the fly.
I ran the same file append app (75GB transfer) on one VM with eager thick, and one VM with lazy thick:
- eager thick - 13 mins
As you guys said, this is expected...
- lazy thick - 18 mins
That's more in line with my expectations of disk performance (versus 15 mins on the slower physical PC, and 37 mins for the thin-provisioned VM).
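To put those times in perspective as rough sequential-write throughput (my arithmetic, assuming 75 GB = 76,800 MB):

```shell
# 75 GB appended: eager thick 13 min, physical PC 15 min, lazy thick 18 min, thin 37 min
for t in 13 15 18 37; do
  awk -v min="$t" 'BEGIN { printf "%2d min -> ~%.0f MB/s\n", min, 75 * 1024 / (min * 60) }'
done
```

So the eager-thick VM sustains roughly 98 MB/s, which is about what a Barracuda 7200.12 can do sequentially, while the thin disk drops to roughly a third of that.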
I also needed to perform the above tests after a snapshot was taken:
- eager thick - 25 mins
- lazy thick - 23 mins
Again, I can imagine the performance being worse, because the snapshot causes the changing/changed data to be written to the snapshot delta files, and that means space has to be allocated on the fly (as in the thin-provisioned case). Still, the values are better than the 37 mins.
I will see if there are any snapshot repository advanced configuration parameters that can be used to improve that latency. Barring that, I have the fallback of using SSD(s) instead of spindle drives. I don't think I have any applications that will hit the disk that hard, but it was a good discovery to learn about these limitations.
Appreciate all your help! Thanks!