VMware Cloud Community
toprock
Contributor

iSCSI Multipathing Recovery Times

Hi all,

I have a Pure Storage SAN configured with 4 x 25G Ethernet adapters - 2 Controllers with 2 ports on each.

My VM-Hosts have dedicated 25G Ethernet adapters (2 ports), configured in a single vSwitch with 2 Port Groups, each using a single port.

The iSCSI Initiator has been configured and the 2 Port Groups added.

All configuration looks good and my VM-Hosts can connect to the storage.

During my testing of the Multipathing I have noticed that if I disconnect one of my iSCSI ports my VM stops writing to disk for around 40 seconds. After it recovers it's fine.

I have performed this test on both of my iSCSI interfaces and experienced the same issue.
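For anyone wanting to reproduce the test from inside a VM, here is a minimal sketch of how the stall could be measured. It assumes a guest with Python available and a file on the iSCSI-backed disk; the file name `probe.bin` and the function names are made up for illustration. It issues small synced writes and reports the longest gap between two completed writes.

```python
import os
import time


def longest_gap(timestamps):
    """Return the largest interval between consecutive timestamps."""
    if len(timestamps) < 2:
        return 0.0
    return max(b - a for a, b in zip(timestamps, timestamps[1:]))


def probe_writes(path, duration_s=120.0, interval_s=0.1):
    """Append a small record and fsync it repeatedly; return completion times."""
    completed = []
    deadline = time.monotonic() + duration_s
    with open(path, "ab") as f:
        while time.monotonic() < deadline:
            f.write(b"x" * 512)
            f.flush()
            os.fsync(f.fileno())  # blocks until the write has reached the disk
            completed.append(time.monotonic())
            time.sleep(interval_s)
    return completed


if __name__ == "__main__":
    # "probe.bin" is a hypothetical file on the disk under test.
    # Pull the iSCSI cable while this is running.
    times = probe_writes("probe.bin")
    print(f"longest write stall: {longest_gap(times):.1f} s")
```

Start the script, disconnect one iSCSI port partway through, and compare the reported stall with the roughly 40 seconds described above.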

My questions are:

1. Is this expected behavior? - I don't see this with Fibre Channel

2. Is there anything that can be done to reduce the recovery timeout when I lose one of my iSCSI adapters?

Any help would be greatly appreciated, as I'm really struggling to find information on the expected behavior of VMware when a host loses an iSCSI path.

Thanks

J


Accepted Solutions
toprock
Contributor

Update

I have had confirmation from VMware and my SAN provider (Pure Storage) that a recovery time of 35 seconds is to be expected when a path fails over iSCSI.

This really surprised me, as I thought multipathing would simply have marked the dead path as down and carried on sending packets over the other path(s). What particularly surprised me is that even in an Active/Active configuration, where I can see an even distribution of IOPS across the SAN interfaces, a failed path halts all writes to the SAN for that host while the system tests the dead path.

I spent 2 days testing and researching this behavior before reaching out to VMware and Pure Storage.

I hope this updated post saves others from wasting as much time as I have.

In short - iSCSI multipathing will halt writes to the SAN for up to 35 seconds (with default settings) when an iSCSI path goes down.
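As a rough illustration of where a figure like 35 seconds could come from: one common explanation, which is an assumption here and was not spelled out in the confirmation from VMware or Pure Storage, is that the software iSCSI adapter's NoopOutInterval, NoopOutTimeout and RecoveryTimeout settings add up in the worst case. The default values in the sketch below (15, 10 and 10 seconds) are likewise assumed and should be checked against the adapter's advanced settings on your own host.

```python
# Assumed defaults for the ESXi software iSCSI adapter, in seconds.
# These values are NOT taken from the thread; verify them on your host.
ASSUMED_DEFAULTS = {
    "NoopOutInterval": 15,  # how often a keep-alive NOP-Out is sent
    "NoopOutTimeout": 10,   # how long to wait for the NOP-Out reply
    "RecoveryTimeout": 10,  # how long to attempt session recovery
}


def worst_case_stall(params):
    """Upper bound, in seconds, before I/O on a dead path is failed over."""
    return (
        params["NoopOutInterval"]
        + params["NoopOutTimeout"]
        + params["RecoveryTimeout"]
    )


if __name__ == "__main__":
    print(f"worst-case stall: {worst_case_stall(ASSUMED_DEFAULTS)} s")
```

If that model holds, lowering any of the three values would shorten the stall, at the cost of failing paths over on shorter network blips, so it is worth confirming with VMware and your storage vendor before changing them.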


4 Replies
depping
Leadership

It seems to vary between adapters. I have seen failovers occur within 20 seconds, and sometimes it takes 30 seconds. I guess that is the downside of this configuration. My guess is that a single port group with 2 NICs would behave differently, but I don't have the option to test that for you here.

toprock
Contributor

Thank you for your reply Depping.

It's not possible to have 2 ports in a Port Group that has a Port Binding for the iSCSI Initiator.

I have a case open with VMware and will report back their suggestions. So far they have advised that the time to recover should not generate a freeze.

depping
Leadership

toprock wrote: "It's not possible to have 2 ports in a Port Group that has a Port Binding for the iSCSI Initiator."

I understand that it isn't possible; I was suggesting doing this without port binding.
