VMware Cloud Community
jhboricua
Enthusiast

Strange Read Latency issue on select VMs, need help with troubleshooting

My setup is as follows:

4 hosts in a Dell VRTX Chassis with 25 SAS 10k Drives

Running ESXi 5.5 U1

1 VMFS5 datastore provided by 24 SAS drives on the VRTX. These drives have been configured as a single RAID 10 virtual disk. Connectivity to the 4 hosts is SAS.

I'm also testing an EMC XtremIO flash array against this cluster. Connectivity to the XtremIO VMFS volume is via 10Gb iSCSI. Round Robin PSP, iops=1 per EMC documentation for XtremIO.
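For anyone unfamiliar with what iops=1 changes, here is a rough toy model in Python. It is only a sketch of the idea that the active path rotates after a fixed number of I/Os. It is not how the ESXi path selection plugin is actually implemented, and the path names and the function `assign_paths` are made up for illustration. The value 1000 is used as the comparison point on the assumption that it is the usual Round Robin default.

```python
from itertools import cycle


def assign_paths(paths, num_ios, iops_limit):
    """Return the path chosen for each I/O when the active path
    rotates to the next one after `iops_limit` I/Os (toy model)."""
    rotation = cycle(paths)
    current = next(rotation)
    sent_on_current = 0
    assignments = []
    for _ in range(num_ios):
        # Switch paths once the current one has carried its quota.
        if sent_on_current == iops_limit:
            current = next(rotation)
            sent_on_current = 0
        assignments.append(current)
        sent_on_current += 1
    return assignments


# Hypothetical path names, not taken from the hosts in this thread.
paths = ["path-A", "path-B"]

# With a limit of 1, consecutive I/Os alternate between the paths.
print(assign_paths(paths, 4, iops_limit=1))

# With a limit of 1000, a short burst stays on a single path.
print(assign_paths(paths, 4, iops_limit=1000))
```

With a limit of 1 a short burst is spread across both paths, while with a limit of 1000 the same burst stays on one path until the quota is used up.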

On this cluster I have a little over 100 VMs. They are all thin provisioned as it is a dev/test cluster. As I've mentioned, we are testing the XtremIO against this cluster to get a sense of what sort of deduplication and compression we can realistically get in our environment. As such I presented a VMFS volume from the XtremIO to the 4 hosts via iSCSI using the vSphere software iSCSI adapter.

The issue is as follows:

I set up a PowerCLI script to svMotion the running VMs from the VRTX datastore to the XtremIO datastore one VM at a time, to gather some stats. During this process we noticed that on certain VMs, about 5 or 6 out of the 100+ in the cluster, the svMotion transfer rate would plummet and the read latency would climb and remain above 500 ms (sometimes as high as 800 ms). Those were the numbers reported by the ESXi host where the VM in question lived at the time it was svMotioned. The VM, of course, was almost unresponsive. But the latency issue seems to be constrained ONLY TO THAT VM: if I open the console of another VM on the same VRTX datastore, it performs just fine. Even a VM on the same datastore AND the same host as the VM with the high read latency performs fine. So only the VM being svMotioned is unresponsive.
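To make the "climbs and remains above 500 ms" criterion concrete, here is a minimal Python sketch of how per-VM read latency samples could be screened for a sustained spike rather than a single blip. It is not the PowerCLI script mentioned above. The function names, the three-sample window, and the sample values are all assumptions for illustration, and the numbers are placeholders, not measurements from this cluster.

```python
def sustained_high_latency(samples_ms, threshold_ms=500, min_consecutive=3):
    """Return True if latency exceeds the threshold for at least
    `min_consecutive` samples in a row."""
    run = 0
    for value in samples_ms:
        # Extend the current run on a high sample, reset it otherwise.
        run = run + 1 if value > threshold_ms else 0
        if run >= min_consecutive:
            return True
    return False


def flag_vms(latency_by_vm, threshold_ms=500, min_consecutive=3):
    """Return the sorted names of VMs with a sustained latency spike."""
    return sorted(
        name
        for name, samples in latency_by_vm.items()
        if sustained_high_latency(samples, threshold_ms, min_consecutive)
    )


# Placeholder read latency samples in ms (illustrative only).
example = {
    "vm-a": [4, 6, 5, 7, 5],          # normal throughout
    "vm-b": [8, 520, 640, 800, 790],  # climbs and stays high
    "vm-c": [5, 510, 9, 6, 4],        # single spike, recovers
}

print(flag_vms(example))  # ['vm-b']
```

Requiring several consecutive high samples separates the VMs that stay slow for the whole svMotion from ones that only show a momentary spike.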

Here is where the plot thickens. The problem also manifests when svMotioning these select VMs in the opposite direction (XtremIO to VRTX), with the same high read latency now being reported on the hosts as coming from the XtremIO flash array.

Furthermore, if I shut down the VMs in question and then svMotion them, there's no issue at all. So it only happens when these particular VMs are svMotioned while running.

Any thoughts??

1 Reply
chriswahl
Virtuoso

What do the configuration and workload of the VMs that experience the latency look like?

VCDX #104 (DCV, NV) ஃ WahlNetwork.com ஃ @ChrisWahl ஃ Author, Networking for VMware Administrators