VMware Cloud Community
Ugur_Demirciogl
Contributor

ESXi 6.0 round robin multipathing latency problem

Hi All,

I am using ESXi 6.0 and have changed my storage system. After I moved all VM datastores to the new storage, I ran into a latency issue. While investigating, I found that when I change the path policy from round-robin to fixed path, the latency returns to normal. I want to use the round-robin policy, but at this point I can't. Has anyone experienced the same problem before? I would be glad if someone could share an idea on the subject. Thanks.

Accepted Solutions
Ardaneh
Enthusiast

Hi

First of all, your storage must support round robin (RR), so check this first. Keep in mind that plain round-robin stops using a path only when that path is dead. So if the latency is 50 ms on one path and 1 ms on the other, it still uses both paths equally.
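
To make the effect concrete, here is a minimal sketch (not ESXi code) that spreads I/Os evenly over a set of paths and reports the resulting average latency. The function name and the I/O count are made up for illustration; only the 50 ms and 1 ms figures come from the example above.

```python
from itertools import cycle


def average_latency_ms(path_latencies_ms, io_count):
    """Average latency when I/Os are issued to each path in turn."""
    paths = cycle(path_latencies_ms)
    total = sum(next(paths) for _ in range(io_count))
    return total / io_count


# Round robin over a 50 ms path and a 1 ms path (values from the post).
round_robin = average_latency_ms([50.0, 1.0], io_count=1000)
# Fixed policy pinned to the healthy 1 ms path.
fixed = average_latency_ms([1.0], io_count=1000)

print(f"round robin: {round_robin} ms, fixed: {fixed} ms")
```

Under these assumptions, half of all I/Os land on the slow path, which is why switching to a fixed path on the healthy link can make the latency look normal again.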

When you use multipathing and run into this kind of issue, you need to validate all physical path components (cables, modules, ports).

In ESXi 6.7 there is a specific latency-aware policy for RR. When it is enabled, ESXi samples the paths every 3 minutes with 16 I/Os. It then calculates the average latency of those I/Os and decides (in comparison with the other paths) whether or not to use that path. If a path is deemed too unhealthy, it is excluded until the next sampling period begins 3 minutes later, when it is re-evaluated.
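
The sampling idea can be sketched roughly as follows. This is only an illustration of the description above, not the actual ESXi algorithm: the 16 samples and 3 minute interval are taken from the post, while the `UNHEALTHY_FACTOR` threshold, the function name, and the path names are assumptions, since the post does not say how "too unhealthy" is decided.

```python
from statistics import mean

SAMPLES_PER_PATH = 16      # from the post: 16 I/Os per sampling period
SAMPLING_INTERVAL_S = 180  # from the post: paths are re-evaluated every 3 minutes
UNHEALTHY_FACTOR = 3.0     # hypothetical: exclude paths 3x slower than the best one


def select_paths(samples_ms):
    """Return the paths whose average sampled latency is close to the best path."""
    averages = {
        path: mean(latencies[:SAMPLES_PER_PATH])
        for path, latencies in samples_ms.items()
    }
    best = min(averages.values())
    return sorted(p for p, avg in averages.items() if avg <= best * UNHEALTHY_FACTOR)


# Hypothetical samples: one healthy path and one degraded path.
samples = {"path_a": [1.0] * 16, "path_b": [50.0] * 16}
active = select_paths(samples)
print(active)
```

In this sketch the degraded path is left out until the next sampling period, at which point `select_paths` would be called again with fresh samples.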

You can find more information about how to configure this policy in this link.

Hope this helps.

9 Replies
MikeStoica
Expert

Do you have iSCSI storage? If so, have you checked VMware's "Best Practices For Running VMware vSphere On iSCSI" guide?

sjesse
Leadership

If the above guide doesn't help, you should reach out to your storage vendor, or at the very least let us know who that vendor is, as each has different requirements that you need to follow. For example, Nimble arrays have a plugin you need to install that handles the multipath options for you, and other vendors have similar tools.

Ugur_Demirciogl
Contributor

Hi Mike,

I am using the FC protocol, not iSCSI. By the way, I presented a datastore over iSCSI just as a test, and there was no issue on iSCSI, but I need to use FC.

Ugur_Demirciogl
Contributor

I am using a NetApp FAS series storage system.

sjesse
Leadership

Contact NetApp and let them know. If you're using fixed and it works, one of the paths is causing problems. We saw something similar when an SFP went bad, which caused the light levels to drop out of range. There isn't much you can do on the VMware configuration side.

andres_prieto_a
Contributor

Hi

I would suggest, as others have done, contacting the vendor to confirm the best policy to apply for your storage. Keep in mind that the policy needs to be aligned between the storage and ESXi; otherwise, as you are experiencing, it can cause problems, because the storage expects one policy and VMware is using another.

We suffered from this with EMC in the past.

Regards

Ugur_Demirciogl
Contributor

Hi,

Yesterday I checked all the physical path components and found that one port on the Brocade SAN switch had a large number of CRC errors. I replaced the FC cable on that port, and now round robin works well.

Thank you all.
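
For anyone chasing a similar fault, one simple way to spot a bad link is to compare per-port CRC error counters between two points in time and flag any port whose counter keeps growing. The sketch below assumes the counters have already been collected into dictionaries; the port names and counter values are hypothetical and are not taken from this thread or from any switch output.

```python
def ports_with_growing_crc(before, after):
    """Return ports whose CRC error counter increased between two snapshots."""
    return sorted(
        port for port, count in after.items() if count > before.get(port, 0)
    )


# Hypothetical counter snapshots taken some time apart.
before = {"port0": 0, "port1": 120, "port2": 3}
after = {"port0": 0, "port1": 4875, "port2": 3}

suspects = ports_with_growing_crc(before, after)
print(suspects)
```

A port with a steadily increasing counter is a candidate for a cable, SFP, or port check, whereas a counter that is nonzero but static may just reflect an old event.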