Hi all,
I have a Pure Storage SAN configured with 4 x 25G Ethernet adapters - 2 Controllers with 2 ports on each.
My VM-Hosts have dedicated 25G Ethernet adapters (2 ports), configured in a single vSwitch with 2 Port Groups, each bound to a single NIC.
The iSCSI Initiator has been configured and the 2 Port Groups added.
All configuration looks good and my VM-Hosts can connect to the storage.
During my testing of the multipathing I have noticed that if I disconnect one of my iSCSI ports, my VM stops writing to disk for around 40 seconds. After it recovers it's fine.
I have performed this test on both my iSCSI interfaces and experience the same issue.
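For anyone reproducing this, the per-path state can be watched from the ESXi shell while a port is pulled. A sketch (no device names are assumed; these commands list everything on the host):

```shell
# List all storage paths with their state -- watch one flip from
# "active" to "dead" after unplugging an iSCSI port
esxcli storage core path list

# Show the path selection policy (PSP) per device;
# Round Robin is typical for an active/active array
esxcli storage nmp device list

# Show the iSCSI sessions the software initiator currently holds
esxcli iscsi session list
```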
My questions are:
1. Is this expected behavior? - I don't see this with Fibre Channel
2. Is there anything that can be done to reduce the recovery timeout when I lose one of my iSCSI adapters?
Any help would be greatly appreciated, as I'm really struggling to find information on the anticipated behavior of VMware when a host loses an iSCSI path.
Thanks
J
Update
I have had confirmation from VMware and my SAN provider (PureStorage) that a recovery time of 35 seconds is to be expected when a path fails over iSCSI.
This really surprised me as I thought multipathing would have simply marked the dead path as down and carried on sending packets over the other path(s). What particularly surprised me is that even in an Active/Active configuration - I can see an even distribution of IOPS on the SAN interfaces - a failed path halts all writes to the SAN for that host while the system tests the dead path.
I have spent 2 days testing and researching this behavior before reaching out to VMware and PureStorage.
I hope this updated post saves others from wasting as much time as I have.
In short - iSCSI multipathing will halt writes to the SAN for up to 35 seconds (with defaults set) when an iSCSI path goes down.
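For reference, the ~35 second figure lines up roughly with the default per-adapter iSCSI timeouts (NoopOutInterval, NoopOutTimeout, RecoveryTimeout), which can be inspected and cautiously lowered with esxcli. A sketch - `vmhba64` is a placeholder for your software iSCSI adapter, the values below are illustrative, and lowering them too far risks false path-dead declarations, so validate against VMware and your array vendor's guidance first:

```shell
# Show the current timeout parameters on the software iSCSI adapter
# (vmhba64 is a placeholder -- find yours with: esxcli iscsi adapter list)
esxcli iscsi adapter param get -A vmhba64

# Example: lower the no-op probe interval/timeout and recovery timeout.
# Smaller values detect a dead path sooner, but risk spurious failovers
# on a congested network.
esxcli iscsi adapter param set -A vmhba64 -k NoopOutInterval -v 10
esxcli iscsi adapter param set -A vmhba64 -k NoopOutTimeout -v 10
esxcli iscsi adapter param set -A vmhba64 -k RecoveryTimeout -v 10
```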
It seems to vary between adapters; I have seen failovers occur within 20 seconds, sometimes it takes 30 seconds, etc. I guess that is the downside of this configuration. My guess is that a single port group with 2 NICs would display a different behavior, but I don't have the option here to test it for you.
Thank you for your reply Depping.
It's not possible to have 2 ports in a Port Group that has a Port Binding for the iSCSI Initiator.
I have a case open with VMware and will report back their suggestions. So far they have advised that the time to recover should not generate a freeze.
It's not possible to have 2 ports in a Port Group that has a Port Binding for the iSCSI Initiator.
I understand that that isn't possible; I was suggesting doing this without port binding.