4 Replies Latest reply on Jul 4, 2019 12:42 AM by depping

    iSCSI Multipathing Recovery Times

    toprock Lurker

      Hi all,

       

      I have a Pure Storage SAN configured with 4 x 25G Ethernet adapters - 2 Controllers with 2 ports on each.

       

      My VM-Hosts have dedicated 25G Ethernet adapters (2 ports) which have been configured in a single vSwitch which has 2 Port Groups with a single card in each.

      The iSCSI Initiator has been configured and the 2 Port Groups added.

       

      All configuration looks good and my VM-Hosts can connect to the storage.

       

      During my testing of the Multipathing I have noticed that if I disconnect one of my iSCSI ports my VM stops writing to disk for around 40 seconds. After it recovers it's fine.

      I have performed this test on both my iSCSI interfaces and experience the same issue.

       

      My questions are:

      1. Is this expected behavior? - I don't see this with Fibre Channel

      2. Is there anything that can be done to reduce the recovery timeout when I lose one of my iSCSI adapters?

       

      Any help would be greatly appreciated, as I'm really struggling to find information on the anticipated behavior of VMware when a host loses an iSCSI path.

       

      Thanks

      J

        • 1. Re: iSCSI Multipathing Recovery Times
          depping Champion
          User Moderators, VMware Employees

          It seems to vary between adapters: I have seen failovers occur within 20 seconds, and sometimes it takes 30 seconds. I guess that is the downside of this configuration. My guess is that a single port group with 2 NICs would display different behavior, but I don't have the option here to test it for you.

          • 2. Re: iSCSI Multipathing Recovery Times
            toprock Lurker

            Thank you for your reply Depping.

             

            It's not possible to have 2 ports in a Port Group that has a Port Binding for the iSCSI Initiator.

             

            I have a case open with VMware and will report back their suggestions. So far they have advised that the time to recover should not generate a freeze.

            • 3. Re: iSCSI Multipathing Recovery Times
              toprock Lurker

              Update

               

              I have had confirmation from VMware and my SAN provider (Pure Storage) that a recovery time of 35 seconds is to be expected when a path fails over iSCSI.

               

              This really surprised me, as I thought multipathing would simply mark the dead path as down and carry on sending packets over the other path(s). What particularly surprised me is that even in an Active/Active configuration (I can see an even distribution of IOPS on the SAN interfaces) a failed path just halts all writes to the SAN for that host while the system tests the dead path.

               

              I have spent 2 days testing and researching this behavior before reaching out to VMware and PureStorage.

               

              I hope this updated post saves others from wasting as much time as I have.

               

              In short - iSCSI multipathing will halt writes to the SAN for up to 35 seconds (with defaults set) when an iSCSI path goes down.
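
For anyone wondering where a figure like 35 seconds could come from, here is a minimal sketch of the arithmetic. It is not from VMware or Pure Storage. It assumes the stall is roughly the sum of three software iSCSI timers (a NOP-Out interval, a NOP-Out timeout, and a recovery timeout) and that their defaults are 15, 10, and 10 seconds. Those values, the class name, and the field names are assumptions for illustration, so check the actual settings on your own adapter before relying on them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IscsiTimers:
    """Hypothetical model of the timers that bound iSCSI path-failure detection.

    The default values are assumed, not taken from this thread.
    """
    noop_interval_s: int = 15     # assumed: seconds between keepalive pings
    noop_timeout_s: int = 10      # assumed: wait for a keepalive reply
    recovery_timeout_s: int = 10  # assumed: wait before the path is failed over

    def worst_case_stall_s(self) -> int:
        # Worst case: the path dies right after a successful keepalive, so the
        # initiator waits a full interval, then the reply timeout, then the
        # recovery timeout before I/O moves to a surviving path.
        return self.noop_interval_s + self.noop_timeout_s + self.recovery_timeout_s


defaults = IscsiTimers()
# Purely illustrative lower values, not a recommendation from either vendor.
tuned = IscsiTimers(noop_interval_s=5, noop_timeout_s=5, recovery_timeout_s=1)

print(f"assumed defaults: up to {defaults.worst_case_stall_s()} s")
print(f"illustrative tuned values: up to {tuned.worst_case_stall_s()} s")
```

Under this model the stall also varies with when the path fails relative to the last keepalive, which would fit the 20 to 40 second spread reported earlier in the thread.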

               


              • 4. Re: iSCSI Multipathing Recovery Times
                depping Champion
                VMware Employees, User Moderators

                It's not possible to have 2 ports in a Port Group that has a Port Binding for the iSCSI Initiator.

                 

                I understand that isn't possible; I was suggesting doing this without port binding.