VMware Cloud Community
Michael6835
Contributor

Multipathing Status

Hello,

I have 2 HP DL380G6 servers connected to a HP MSA 2312i san. The controllers are active/active and work with ULP.

I have 2 separate HP switches (Procurve 1800) on separate networks dedicated to iscsi only.

I am using the ESX software iSCSI initiator and have configured two separate vSwitches, one uplinked to each switch. I have bound the two separate vmkernel NICs to the iSCSI HBA.
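For anyone wanting to reproduce the binding, this is roughly what I ran from the service console (ESX 4.x `esxcli swiscsi` syntax; the vmk and vmhba names below are examples from my setup and may differ on yours):

```shell
# Each iSCSI vmkernel port lives on its own vSwitch with a single uplink.
# List the storage adapters to find the software iSCSI HBA name (vmhba33 here is an example):
esxcfg-scsidevs -a

# Bind each iSCSI vmkernel NIC to the software iSCSI initiator:
esxcli swiscsi nic add -n vmk1 -d vmhba33
esxcli swiscsi nic add -n vmk2 -d vmhba33

# Verify the bindings:
esxcli swiscsi nic list -d vmhba33

# Rescan so the new paths are discovered:
esxcfg-rescan vmhba33
```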

All is working, but while going through the config I noticed that under Storage Views I am seeing "Partial/No Redundancy".

I am wondering why that is the case.

I have a theory, but I wanted some other eyes to look at it.

The esx has access to both switches which are linked to both controllers.

My guess is that since it's active/active with ULP, all 4 paths are being presented to the host under the same iSCSI target name (different IPs, same name).

I figure that because everything shares the same target name, ESX concludes there is no redundancy. Is that right?

I believe it's redundant because under the multipathing view I see two Active (I/O) paths, one from each subnet, on the owning controller. My understanding of ULP is that it exposes all LUNs through all host ports on both controllers. ULP appears to the host as an active-active storage system where the host can choose any available path to access a LUN, regardless of vdisk/LUN ownership.
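You can check the same thing from the service console; this is how I looked at the path counts and states (output will obviously vary per host):

```shell
# Brief listing of paths per device -- with the MSA each LUN should show
# 4 paths (2 vmkernel NICs x 2 controller ports):
esxcfg-mpath -b

# More detail, including the path selection policy and which paths
# are Active (I/O) versus merely Active:
esxcli nmp device list
```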

Can someone shed some light?

thanks

Attached are images of the different areas.

4 Replies
Sly
Contributor

Michael,

We have an almost identical hardware setup to yours, and from looking at the screenshots, the network config for the storage is the same too. The only difference is we are using HP 2910al switches. We are also seeing Partial/No Redundancy listed. Our failover is working: we can pull any of the 4 connections to the SAN and VMware determines a new path. But I can only ever get one live connection at a time, which I think means that only one controller is being used at a time instead of two.

What I think we need to set up in order to get full redundancy is multipathing across both controllers. This is where things are unclear for me, and there is no HP documentation on setting up vSphere with the MSAs. How do you make both controllers actively handle the data at the same time? Do you need to create a trunk, or what Cisco folks call an EtherChannel? The HP 1800 and 2910al switches that you and I both have are not capable of what HP calls "distributed trunking," where you make the two switches aware of each other and then create a trunk across them. Maybe the other option to confirm this as the cause is to route all of your connections through one switch, create two trunks, and see if your redundancy goes to Full. I hope my comments help.

I would appreciate it if someone out there who is an expert with iSCSI multipathing to an active-active SAN could help us out. Is my assumption about the cause of this partial redundancy correct? Does anyone out there have the HP MSA 2000 series working with full redundancy under vSphere? If so, what are we doing wrong?

matthiaseisner
Hot Shot

Hi,

I don't think an MSA2xxx is able to do true active-active; only high-end arrays can do such things. The LUN is owned by just one SP at a time, and I think the ESX server can't see redundancy if you use an iSCSI storage system like this. If your owning SP fails, the other one is able to take the LUN over (trespass). You only have one active connection for I/O; in your case two, because you use two vmkernel ports, but the storage traffic will not use both connections at the same time for the same traffic (VM). The only option to implement such a setup is 802.3ad (link aggregation). With it you can use two or more connections simultaneously for traffic, but all of that traffic will still go through one SP.

I hope this helps. If not just ask.

Rgds

Matthias

Michael6835
Contributor

Hi,

If I am understanding the description from HP, it all has to do with how the storage controller presents the LUNs to the hosts. See the attached PDF, which is from a techbook on the MSA that explains ULP. Apparently with ULP the target shows up under the same name to the hosts, which is why ESX thinks it doesn't have another failover path waiting. One controller owns the LUNs, but all 4 host ports are used; I/O arriving on the other controller's ports is passed over to the owning controller.

Can anyone else shed some light, perhaps someone from HP who might peruse these boards?

Sly
Contributor

Regarding the partial redundancy: after I replied yesterday, the team I was working with to configure our new vSphere setup continued to do quite a bit of digging into whether we had everything implemented correctly. The consultant we are working with called one of his colleagues and determined that the reason it indicates partial redundancy is that we have each of the four connections going to its own individual vSwitch, just like you have in your original screenshot. We have it set up this way because during our testing we found that the HP MSA's overall responsiveness was slower, and failover didn't seem to work properly, when we had two NICs on one vSwitch. After reconfiguring it so each NIC is on its own vSwitch, failover worked very reliably, but all traffic seemed to be going through only one of the two NICs on controller A. Then, after changing the path selection to Round Robin, the traffic was split across the two NICs on controller A, going across both switches; controller B's NICs were still unused.

I think my understanding of this MSA is similar to what was previously mentioned: the active-active ULP controller configuration is not like other SANs. For this SAN, active-active ULP means that both controllers are always presenting all of the LUNs on all 4 connections, and internally both controllers keep their read and write caches mirrored with each other so a failover can be more immediate. We have only one vdisk with 4 LUNs on it, and according to the documentation and what I see in the service console web GUI, you can assign which controller owns which vdisk, but only at the vdisk level, not the LUN level. Even though the attached HP presentation seems to indicate that the two controllers will split up their activity based on the LUN number, we are currently only seeing traffic coming in on one controller. Unless there is a controller interconnect, I don't understand how the two controllers would share the load of one vdisk.
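For what it's worth, the switch to Round Robin can also be done per device from the service console instead of the GUI. A sketch of what we ran (ESX 4.x `esxcli nmp` syntax; the `naa.` device ID below is a placeholder, not our actual LUN):

```shell
# Find the NAA IDs of the MSA LUNs and their current path selection policy:
esxcli nmp device list

# Set the path selection policy to Round Robin for one LUN
# (replace the naa. ID with your own device ID):
esxcli nmp device setpolicy --device naa.600c0ffexample --psp VMW_PSP_RR
```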

Does someone out there have an MSA with more than one vdisk? Are you able to make controller B the primary on one and controller A the primary on the other?