Disk latency reporting error in vSphere 5

cmbwml1 · ‎01-17-2012

On some of our disk performance graphs I am seeing the highest latency metric recording almost 200 million milliseconds. That equates to latency spikes of over 50 hours. The storage isn't reporting any significant latency. The vSphere client shows these latency spikes on the VM's datastore and disk performance graphs but not for the virtual disk. The host is showing a maximum command latency of 6 seconds but nothing else above 50 ms.
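As a quick check of the arithmetic above, here is a minimal Python sketch that converts the reported value to hours. The function name `ms_to_hours` is made up for illustration; the only input taken from the post is the figure of roughly 200 million milliseconds.

```python
from datetime import timedelta


def ms_to_hours(latency_ms: float) -> float:
    """Convert a latency in milliseconds to hours."""
    return timedelta(milliseconds=latency_ms).total_seconds() / 3600


# Approximate value read off the graph, as described in the post.
reported_ms = 200_000_000
print(f"{ms_to_hours(reported_ms):.1f} hours")  # prints "55.6 hours"
```

For comparison, the 6 second host maximum mentioned above is 6,000 ms, several orders of magnitude below the graphed value.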

Could this be a VAAI primitive issue? Simple performance metric calculation error?

Thanks,

Chris

BharatR · ‎01-17-2012

Hi,

"The vSphere client shows these latency spikes on the VM's datastore and disk performance graphs but not for the virtual disk"

Rescan all storage adapters so that, if a dead LUN is causing this, it can be removed from the configuration.

This happens when hosts are trying to connect to a dead LUN or path.

Here is the reference guide for the highestlatency parameter:

http://www.vmware.com/support/developer/vc-sdk/visdk400pubs/ReferenceGuide/disk_counters.html

Best regards, BharatR--VCP4-Certification #: 79230, If you find this information useful, please award points for "correct" or "helpful".

PUREJOY · ‎01-18-2012

Not really.

This is a legitimate problem in your system; you may have to isolate the VMs causing the spike.

I would start with the backend LUN configuration, move on to the ESX configuration, such as the multipath config (SATP and path policy), and finally see if the VMs are doing some kind of burst I/O.

In all cases, I would use esxtop/resxtop to gather disk data (options u, v, and d).

Let us know if you see something fishy.

--Ravi

Architect @ Pure Storage || www.purestorage.com || http://www.purestorage.com/blog/ || http://twitter.com/#!/purestorage ||@ravivenk || VCAP-DCA5, VCP 4, VCP 5

MK22 · ‎02-15-2012

This has been happening to me ever since I upgraded two of our datacenters to vSphere 5. It is impossible to have 55 hours of latency in a span of 20 seconds, so I'd say something has to be wrong with the statistics math on the data in the database. Even if the stats lagged behind on a change of path, it would be impossible. The machine in this screenshot has had a maximum of < 1 ms of latency, but it has been calculated out to 55 hours. Have you found anything out? I've done numerous searches, but I think I'm going to have to open a service request.
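The plausibility argument above can be sketched as a small filter over exported latency samples. This is only an illustration: the 20-second span comes from the post, while the function name, the `factor` margin, and the sample values are assumptions invented for the example.

```python
# Span mentioned in the post; treated here as the sampling interval (assumption).
SAMPLE_INTERVAL_S = 20


def implausible_samples(samples_ms, interval_s=SAMPLE_INTERVAL_S, factor=10):
    """Return (index, value) pairs whose latency exceeds factor * interval.

    `factor` is an arbitrary safety margin chosen for this sketch, so that
    slow but believable commands are not flagged.
    """
    limit_ms = interval_s * 1000 * factor
    return [(i, v) for i, v in enumerate(samples_ms) if v > limit_ms]


# Made-up samples: three sub-millisecond readings and one bogus spike.
samples = [0.8, 0.6, 200_000_000, 0.9]
print(implausible_samples(samples))  # prints [(2, 200000000)]
```

A 6 second reading (6,000 ms) passes this check, while a 55 hour reading does not, which matches the intuition that the spike is a calculation artifact rather than real I/O delay.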

VCP

MK22 · ‎02-15-2012

Oh good, I finally found something...

It's a known issue.

http://kb.vmware.com/selfservice/microsites/search.do?cmd=displayKC&docType=kc&externalId=2012309&sl...

VCP

SteveBeal · ‎03-16-2012

Having the same issue. However when I look at esxtop everything appears to be fine.

MK22 · ‎03-16-2012

Yeah, that's the workaround they've presented. I guess we'll have to wait until 5.1 to have this fixed...

Resolution

This is a known issue.

To work around this issue, use ESXTOP to accurately measure the disk latency. You can also use a third party utility in the guest operating system to measure disk latency.
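One way to apply this workaround is to capture esxtop in batch mode (for example `esxtop -b -d 5 -n 12 > esxtop.csv`) and summarize the latency columns afterwards. The Python sketch below is a rough illustration, not a VMware tool: it assumes the latency counters can be recognized by the text "MilliSec/Command" in the column header, and the sample data and column names are invented for the example.

```python
import csv
import io


def max_latency_by_counter(csv_text, marker="MilliSec/Command"):
    """Return the peak value of every CSV column whose header contains `marker`."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    cols = [i for i, name in enumerate(header) if marker in name]
    peaks = {header[i]: 0.0 for i in cols}
    for row in reader:
        for i in cols:
            try:
                value = float(row[i])
            except (ValueError, IndexError):
                continue  # skip blank or malformed cells
            peaks[header[i]] = max(peaks[header[i]], value)
    return peaks


# Invented sample; real esxtop headers are longer and include host and device names.
sample = (
    "Time,Disk A Average Guest MilliSec/Command,Disk A Commands/sec\n"
    "t0,0.8,120\n"
    "t1,6.2,95\n"
    "t2,1.1,130\n"
)
print(max_latency_by_counter(sample))
```

For a real capture, read the file with `open("esxtop.csv", newline="")` and pass its contents to the function. Peaks in the low milliseconds here, next to a multi-hour spike in the client graph, would point to the reporting error rather than the storage.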

VCP

AdamKski · ‎02-25-2013

I'm seeing these same issues in 5.1. Anyone else??
