VMware Cloud Community
MoneyChang
Enthusiast

Datastore High Latency on a Host with no VM

We have 15 LUNs, all shared to 7 hosts. We noticed that only one host has bad disk performance (high datastore write latency). That host is very lightly loaded (no VMs on it), while the others carry heavier loads. Even so, it is the only host with a disk performance issue: it sees 30 - 40 ms write latency to those LUNs, while at the same time the other hosts show no latency issue. I can't figure out why only this host has a disk performance issue on the same LUNs.

[Attached screenshots: Latency Issue.jpg, Latency Issue-Normal.jpg, ClusterLevell.jpg, HostLevell.jpg]

5 Replies
Sateesh_vCloud

A couple of points to check, if you have not already isolated the source of the latency:

ESXi host to LUN paths - how many paths are there? Are they distributed equally?

ESXi host - which path policy is configured for this LUN: RR, MRU or Fixed?

Which policy is configured for the other LUNs that are working well (on the same host)?

ESXi host to LUN - there are multiple points along the way: VMkernel - HBA card - FC switch - Storage Processor - Disk

If you suspect the first half of that sequence, check the driver version and monitor the esxtop output to see where the actual delay occurs.

Is it happening all day, or only during specific periods?

Who else can access this LUN? Backup software? If so, how are the other hosts configured to avoid such latency?
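
To narrow down which half of that chain is adding the delay, the esxtop latency counters can be compared: GAVG (latency seen by the guest) is roughly the sum of KAVG (time spent in the VMkernel) and DAVG (time spent from the driver down to the array). The sketch below is only an illustration of that reasoning. The function name, the labels it returns, and the threshold values (2 ms for KAVG, 25 ms for DAVG) are assumed rules of thumb, not values taken from this thread.

```python
def classify_latency(davg_ms, kavg_ms, davg_limit_ms=25.0, kavg_limit_ms=2.0):
    """Return a rough hint about where I/O latency is being added.

    davg_ms: device latency (driver, fabric, array), as shown by esxtop DAVG/cmd.
    kavg_ms: VMkernel latency, as shown by esxtop KAVG/cmd.
    The limits are assumed rules of thumb; tune them for your environment.
    """
    kernel_slow = kavg_ms > kavg_limit_ms
    device_slow = davg_ms > davg_limit_ms
    if kernel_slow and device_slow:
        return "both"
    if kernel_slow:
        return "host"
    if device_slow:
        return "fabric-or-array"
    return "ok"


# Hypothetical samples: (DAVG ms, KAVG ms)
samples = [(1.2, 0.1), (35.0, 0.3), (3.0, 12.0)]
for davg, kavg in samples:
    gavg = davg + kavg  # GAVG is approximately DAVG + KAVG
    print(f"GAVG={gavg:.1f} ms -> {classify_latency(davg, kavg)}")
```

A high KAVG with a low DAVG points at the host side (queuing in the VMkernel), while a high DAVG points further down the path.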

Hope this analysis helps....

MoneyChang
Enthusiast

Thank you for the reply!

ESXi host to LUN paths - how many paths are there? Are they distributed equally?

>>> 8 paths in total (4 are active and 4 are for redundancy).

ESXi host - which path policy is configured for this LUN: RR, MRU or Fixed?

Which policy is configured for the other LUNs that are working well (on the same host)?

>>> All the LUNs in our environment are configured for RR. (The latency issue is not tied to one particular datastore on the host; it happens randomly across many datastores.)

ESXi host to LUN - there are multiple points along the way: VMkernel - HBA card - FC switch - Storage Processor - Disk

>>> VMkernel - 10Gb Network Card - Switch - SVC - Storage

If you suspect the first half of that sequence, check the driver version and monitor the esxtop output to see where the actual delay occurs.

>>> I have opened an SR with VMware, and VMware recommends upgrading the NIC driver and firmware to the latest version.

However, I don't think this is the root cause, since we have 14 hosts with exactly the same configuration but only one host has this issue.

It is very hard to catch in esxtop. As the pictures in the first post show, the latency spikes at one point and then drops back to normal.

Whenever I watch esxtop, the issue just doesn't happen.
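
Because the spikes are short and random, one option is to leave esxtop recording in batch mode (for example `esxtop -b -d 5 -n 720 > capture.csv`) and scan the capture afterwards instead of watching the live screen. The following is a minimal sketch, not a tested tool. It assumes the batch output is a CSV whose first column is a timestamp and whose write-latency columns contain the text `MilliSec/Write`; the function name, the file name, the sample data, and the 30 ms threshold are all illustrative.

```python
import csv
import io


def find_spikes(csv_text, column_marker="MilliSec/Write", threshold_ms=30.0):
    """Return (timestamp, column name, value) for every sample above threshold_ms."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])
    # Indexes of the columns whose name contains the marker (assumed naming).
    wanted = [i for i, name in enumerate(header) if column_marker in name]
    spikes = []
    for row in reader:
        for i in wanted:
            try:
                value = float(row[i])
            except (ValueError, IndexError):
                continue  # skip blank or malformed cells
            if value > threshold_ms:
                spikes.append((row[0], header[i], value))
    return spikes


# Tiny made-up capture, only to show the shape of the input.
sample = (
    '"Time","Disk(naa.1)\\Average Guest MilliSec/Write"\n'
    '"10:00:00","1.5"\n'
    '"10:00:05","38.2"\n'
)
for when, column, value in find_spikes(sample):
    print(when, column, value)
```

For a real capture, the file contents could be read with `open("capture.csv", newline="").read()` and passed to `find_spikes`; the timestamps of the spikes can then be compared against events on the host.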

Is it happening all day, or only during specific periods?

>>> All day, but I don't see any pattern; it looks random.

Who else can access this LUN? Backup software? If so, how are the other hosts configured to avoid such latency?

>>> No, there is no backup software. All the hosts in the same cluster can access the LUNs, but only one host has this issue.

Are there any other clues?

brunofernandez1

What kind of storage environment do you use? I would check the HCL to see whether RR is supported with your storage.

MoneyChang
Enthusiast

To brunofernandez1: Thank you. I've checked with our storage vendor, and RR is supported on our storage.

I've also noticed that the host with the latency issue is the master host in the HA-enabled cluster.

Could the host's master role in the HA-enabled cluster be the cause?

MoneyChang
Enthusiast

Status update: This issue is now being handled by the VMware development team.
