VMware Cloud Community
patters98
Enthusiast

vSphere Guest iSCSI multipathing failover not working

The failure seems to be at the vSphere networking level though, so it doesn't look like a Windows or SAN vendor problem.

I have built a two node ESX4i (4.0.0U1) cluster using an EqualLogic PS4000XV which is connected as per the EqualLogic best practice documents. Namely, on each node:

- The vSphere software initiator is bound to four separate vmkernel ports (one per path is recommended).

- The iSCSI vSwitch uses two physical NICs, one into each dedicated iSCSI network switch (PowerConnect 5424).

- EqualLogic say you should bind half of the iSCSI vmkernel ports to one pNIC (with the other pNIC set to unused), and configure the other half of the vmkernel ports the opposite way round.

- Jumbo frames are enabled on the iSCSI vSwitch on each ESXi node at creation time, on the vNICs, and on the physical network switches.

- The two physical iSCSI switches use flow control, and have a 6Gbps link aggregation trunk between them.

So far so normal. This all works and vSphere fails over its storage connections as expected when cables are disconnected.
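
For reference, here is roughly how that setup looks from the ESXi 4.x console. This is only a sketch - the vSwitch, vmnic, vmk and vmhba names and the example IP addresses are assumptions, so substitute your own (the per-port-group active/unused override still has to be set in the vSphere Client or via PowerCLI, since esxcfg-vswitch can't do that part):

  # Assumed names: vSwitch1 for iSCSI, vmnic1/vmnic5 as the two iSCSI pNICs,
  # vmk1/vmk2 as vmkernel ports, vmhba33 as the software iSCSI adapter.
  esxcfg-vswitch -a vSwitch1                  # create the iSCSI vSwitch
  esxcfg-vswitch -m 9000 vSwitch1             # jumbo frames on the vSwitch
  esxcfg-vswitch -L vmnic1 vSwitch1           # uplink to iSCSI switch 1
  esxcfg-vswitch -L vmnic5 vSwitch1           # uplink to iSCSI switch 2

  esxcfg-vswitch -A iSCSI1 vSwitch1           # one vmkernel port group per path
  esxcfg-vmknic -a -i 192.168.200.11 -n 255.255.255.0 -m 9000 iSCSI1
  esxcfg-vswitch -A iSCSI2 vSwitch1
  esxcfg-vmknic -a -i 192.168.200.12 -n 255.255.255.0 -m 9000 iSCSI2
  # (repeat for iSCSI3/iSCSI4 if you run four vmkernel ports)

  esxcli swiscsi nic add -n vmk1 -d vmhba33   # bind each vmkernel port to the software initiator
  esxcli swiscsi nic add -n vmk2 -d vmhba33
  esxcli swiscsi nic list -d vmhba33          # confirm the bindings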

The problem is with guest VMs. For some of these I need to connect iSCSI LUNs using the Microsoft iSCSI Initiator within the VM so I can use Exchange- and SQL-aware off-host backup using Backup Exec - which leverages the EqualLogic Hardware VSS provider. For this I need to create two additional Virtual Machine Port Groups ("VM iSCSI 1" and "VM iSCSI 2") attached to the same vSwitch the vSphere iSCSI initiator is using, and create a vNIC in each (see screenshot):

As with those vmkernel ports, one is bound to the 1st pNIC with the 2nd pNIC unused, and the other Virtual Machine Port Group is configured in the opposite way. In the Guest OS (Windows 2003R2) the EqualLogic Host Integration Tools (3.3.1) see the two adapters and enable multipathing. The tools show an iSCSI connection to the SAN from each adapter.
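
For what it's worth, creating those two extra port groups from the console is just the following (assuming the same vSwitch1 name as in the sketch above; the active/unused override itself is set per port group in the vSphere Client under NIC Teaming):

  esxcfg-vswitch -A "VM iSCSI 1" vSwitch1     # guest iSCSI port group - vmnic1 active, vmnic5 unused
  esxcfg-vswitch -A "VM iSCSI 2" vSwitch1     # guest iSCSI port group - vmnic5 active, vmnic1 unused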

What I discovered recently when I needed to do some maintenance is that if I disconnect one of the iSCSI pNICs the vSphere iSCSI connections fail over fine, but those iSCSI sessions that originate within the VMs do not. In the failed state I could not ping the SAN from these VMs. Since production systems were down I couldn't afford time to troubleshoot so I had to reconnect the physical cable and reboot them. All three VMs with Guest-attached LUNs failed in this way. I have of course double-checked the Networking configs on both ESXi cluster nodes. I am soon going to move all VMs onto one node for some unrelated maintenance, so I shall be able to test in more detail.

So, has anyone else encountered this behaviour? Is there perhaps a better way I should configure these VMs for iSCSI multipathing?

44 Replies
Aceh_King
Contributor

I can confirm the above works, as I have done it many times on NetApp filers.

J_R__Kalf
Contributor

Hey Patters,

Did you find any time to test the suggestions?

Give us a follow-up on your situation.

Jelle

--- VMware VCP since 2006 ---

If you found this or other information useful, please consider awarding points for "Correct" or "Helpful".

patters98
Enthusiast

I haven't had any more testing time, unfortunately. My Dell EqualLogic case was escalated and they're saying they don't support Guest iSCSI sharing a vSwitch or physical NIC with vmkernel iSCSI, which I'm not happy with at all.

I've raised a VMware support request too so we shall see...

Can what I'm trying to do really be so unusual? I'd say most customers will want to do this.

patters98
Enthusiast

VMware have finally come back with the same opinion as Dell - that I need to use a separate vSwitch for it to be supported, so two more physical NICs.

Their tech said I could set up the guest with one vNIC and have that VM port group configured to use both physical NICs in an active/standby config. This would take care of the failover at layer 2 instead of relying on the Eql MPIO driver; however, I would like the guest VMs to have true multipathing since I want the maximum I/O possible (2Gbps).

I may not have time to test soon because I have a week of holiday, then a week of training - to become a VCP :)

william_urban_D
Enthusiast

Hi Patters98,

I just now saw this and am trying to follow what's going on. iSCSI traffic in the VM is treated exactly like normal TCP/IP traffic - ESX doesn't know the difference (well, it does, but not for our purposes). Here is what we do in our solutions lab. Delete the two iSCSI1/iSCSI2 port groups on the vSwitch (EDIT: not the actual vmk iSCSI1/2 - I mean the two you created for the guest). Create a single VM network called "Guest iSCSI" on that vSwitch. Leave the defaults, or configure it as you would normal network traffic (ignore the iSCSI-specific settings for that particular VM network). Then assign the VM two virtual NICs, both attached to Guest iSCSI. Install the HIT kit and MPIO in the guest and voila - you should have no problems after that. The two virtual NICs will have independent IPs in the guest and treat the vSwitch as a normal VM network switch with failover etc. The binding of vmkernels to vmnics is needed for the ESX iSCSI initiator but not for the guest. The recommendation for an additional vSwitch and NICs has merit for additional bandwidth, but it's not mandatory.

(EDIT #2: Remember that the vmk NIC is bound to the iSCSI software initiator, so pulling the cable triggers a failover because of the iSCSI initiator and the MPIO failover policy for the volume. But when you run iSCSI in the guest, pulling the physical NIC doesn't let the VM know it has a cable outage - it's connected to the vSwitch, so it keeps on chugging away. This is why, on a single VM port group with two virtual NICs and a teaming policy such as IP hash or source MAC hash, you get the expected result.)
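
A minimal sketch of the single-port-group arrangement described above, assuming the existing iSCSI vSwitch is called vSwitch1 (the names are illustrative):

  esxcfg-vswitch -A "Guest iSCSI" vSwitch1    # one VM port group, default teaming policy
  # Give the VM two vNICs, both on "Guest iSCSI", assign each its own IP inside
  # Windows, then install the HIT kit / MPIO in the guest. Failover is handled by
  # the vSwitch teaming policy rather than by per-port-group active/unused overrides.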

Give it a try and shoot me an email and we can discuss.

-Will

William Urban

Product Solutions Engineer

Dell Inc.

300 Innovative Way

Suite 301

Nashua, NH 03062

william_urban@dell.com

http://www.dell.com

http://www.equallogic.com

patters98
Enthusiast

Thanks for that Will - I presume that solution does in fact require two more physical NICs, which is what I plan to try once I get back from this week's training course. The new vSwitch will need to be bound to some more unassigned adapters, since a vmnic can only be assigned to a single vSwitch.

kyung1978
Contributor

I would be interested to know the outcome as well, as I am having the same type of issue with our Exchange 2007 environment, where we need to present iSCSI LUNs directly to the guest OS.

william_urban_D
Enthusiast

I've been able to replicate the original poster's problem when using multiple VM network port groups and manually binding NICs. I hadn't run across that before, so it is interesting to note. We normally suggest customers do a few different things based on the hardware and administrative processes they want to use. (These are software initiator examples; HBAs are more straightforward and just use Scenario 3.)

Scenario 1 Example: let's say you have just two NICs for iSCSI in your ESX server. Create a single vSwitch, and put both ESX iSCSI vmkernel ports and a single VM network port group (call it iSCSIGUEST) on it. Bind the vmk ports as per the documentation but leave the VM network port group alone. Assign a virtual NIC to the guest OS VM and put it on the iSCSIGUEST port group. Then, from inside the VM, give the new virtual NIC an IP address that can reach the SAN and away you go.

Scenario 2 Example: this one is used more when you want extra bandwidth or are using our MPIO and HIT kit for advanced integration. Maybe you have two NICs for iSCSI but you also want to guarantee VM throughput with MPIO. Create two vSwitches: vSwitch1 has a single ESX iSCSI vmkernel port and a single VM network port group (iSCSIGUEST1), and vSwitch2 has a single ESX iSCSI vmkernel port and a single VM network port group (iSCSIGUEST2). Bind the vmkernels to the iSCSI initiator as usual, but give the VM two virtual NICs, one on each port group. Give them IP addresses, install the HIT kit and MPIO, and away you go. If you have four iSCSI NICs and want four ESX iSCSI vmkernels and four iSCSI NICs in the VM, you create four of each, and so on.

Scenario 3 Example: if you have more resources, you can create either one or two new vSwitches with dedicated NICs for in-guest iSCSI traffic and follow the same steps. Again, if you are using MPIO + HIT and want to guarantee both paths (VMware treats in-guest iSCSI traffic as normal guest network traffic, so it does what it wants with it), use two virtual NICs and two vSwitches with port groups on them.
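
A rough console sketch of the two-vSwitch layout in Scenario 2 might look like this - the vmnic and vmk numbers, IP addresses and vmhba name are assumptions, so adjust to your environment:

  # vSwitch1: one vmkernel port plus one guest port group on the first iSCSI pNIC
  esxcfg-vswitch -a vSwitch1
  esxcfg-vswitch -m 9000 vSwitch1
  esxcfg-vswitch -L vmnic1 vSwitch1
  esxcfg-vswitch -A iSCSI1 vSwitch1
  esxcfg-vmknic -a -i 192.168.200.11 -n 255.255.255.0 -m 9000 iSCSI1
  esxcfg-vswitch -A iSCSIGUEST1 vSwitch1

  # vSwitch2: the mirror image on the second iSCSI pNIC
  esxcfg-vswitch -a vSwitch2
  esxcfg-vswitch -m 9000 vSwitch2
  esxcfg-vswitch -L vmnic5 vSwitch2
  esxcfg-vswitch -A iSCSI2 vSwitch2
  esxcfg-vmknic -a -i 192.168.200.12 -n 255.255.255.0 -m 9000 iSCSI2
  esxcfg-vswitch -A iSCSIGUEST2 vSwitch2

  # Bind the vmkernel ports to the software initiator, then give the VM one vNIC
  # on iSCSIGUEST1 and one on iSCSIGUEST2, and install HIT + MPIO in the guest.
  esxcli swiscsi nic add -n vmk1 -d vmhba33
  esxcli swiscsi nic add -n vmk2 -d vmhba33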

Hope that helps and we are currently writing up a paper to describe this more in depth.

Thanks,

-Will

patters98
Enthusiast

Thanks Will. These descriptions are often quite complex to follow, but from what I've understood from your post, it sounds as if, to achieve what I wanted using two physical NICs, I'd need to change to the kind of setup below, which I drew in MS Paint (so a single vSwitch per physical NIC).

Is there any downside to this? I can't really see any, but of course you'll need to rewrite your Dell EqualLogic best practice guide. If you think that's fine then I'll give that a try as soon as I can.

As for why there is a path failure issue in the first place - would you agree that the issue is quite likely to be with the EqualLogic DSM driver? The best way to test would be to simulate the same failure on a physical machine running the HIT kit: refer to Jelle's diagram of the network connectivity (with two additional physical switches to simulate the vmnics), and kill not the NIC uplink but the inter-switch connection for one of the paths, like so:

If my suspicion is correct, the HIT kit will continue to send traffic down the now-dead network segment just because the uplink is still up, whereas presumably it should be testing the path end-to-end at regular intervals.

I'm so glad you were able to replicate the problem though - huge thanks for that! Sometimes I get the impression I must be the first guy to try this...

patters98
Enthusiast

I've tested failover with my ESXi host config matching that network config above, and whilst it does now fail over without killing the VM's instance of SQL, I'm still not convinced it's working properly.

Here's what happens:

1) If I disconnect vmnic1, the hypervisor marks the two disconnected paths as dead in the Storage Manage Paths view. If I look in the Eql MPIO tab in MS iSCSI Initiator on the Guest I can see that only one connection exists to the SAN, and in a few minutes it doubles up. So two connections from a single IP at the Guest end (one to eth0 and one to eth1 on the SAN). The vss-control connection is maintained throughout. So in theory, my off-host VSS backups would be ok in this scenario.

2) I reconnected vmnic1 and let it settle until the MPIO tab in MS iSCSI Initiator showed a connection from each of the Guest's vNICs.

3) If I disconnect vmnic5, the hypervisor again fails over fine. The Guest froze up for about 5 seconds but crucially didn't fail. If I look in the Eql MPIO tab in MS iSCSI Initiator I can see only one connection, however - no vss-control connection:

Now this is where things seem to go wrong. I left it like this for about 20 minutes and it never initiated a second connection to the SAN from 192.168.200.49. So if a SAN interface failed later - I'd lose the guest. Improbable, but perfectly possible.

The vss-control connection was not re-established either, meaning that my backups would fail for this guest. I paid a lot of money to make sure this was redundant - but it seems that it isn't. Even clicking Refresh it stays like this:

4) I reconnected vmnic5. The hypervisor re-established all paths immediately, and on the Guest the vss-control connection was also reconnected promptly. I notice that it's never blue, and it's in the "Managed:No" category. Does this mean that its lack of redundancy is actually to be expected?

A full 30 minutes after reconnecting vmnic5 I notice that there is still no reconnection for the failed Guest path. This VM has now lost its redundancy until a reboot forces it to reset its MPIO.

So all in all, given how much these companies market these technologies - they seem to be surprisingly rough around the edges. Should I really be testing to this degree? This is core product functionality. Can anyone else replicate this?
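
For anyone repeating these tests, a few hypervisor-side checks that may help (ESX/ESXi 4.x syntax; vmhba33 and the array IP shown are assumptions):

  esxcfg-mpath -b                             # brief list of devices and the state of each path
  esxcli swiscsi session list -d vmhba33      # iSCSI sessions on the software initiator
  vmkping -s 8972 192.168.200.10              # jumbo-sized ping to an array port to test the path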

william_urban_D
Enthusiast

Hey Patters98, to your first question: the multiple connections from a single IP address are to be expected. Not only does the MPIO DSM enable MPIO across multiple links, it also allows for intelligent routing of the packets. If you have a single member there is still a default of two paths per volume, so it will create a second path from the single IP address to another eth port on the array for some redundancy on the SAN side. As for the other parts of your post, let me hit up my lab, recreate what you have, and see if I can get the same issues. When I tried it before across two vSwitches all my connections came back, but let me redo what you did and I'll keep you posted.

Thanks!

-Will

patters98
Enthusiast

Thanks Will, really appreciate it! Yeah, I was expecting the doubling up of the connections in the MPIO - it's just weird that it doesn't happen after more than one failure. Also, would you expect the vss-control connection to have an affinity for one of the guest NICs?

I need to test on a physical machine too, because I have a feeling that I'm seeing Eql HIT behaviour here, not VMware-related. Going home now though, it's late here...

brianlaw
Contributor

Hi Patters,

I ran into the same problem as you and luckily found your thread, which explains it in great detail. I haven't seen this thread updated for months and wanted to know whether you found a solution in the end. Thank you very much.

Brian

patters98
Enthusiast

I never got anywhere with it I'm afraid. I'm going to push ahead with the latest upgrades before I investigate further (vSphere 4.0U1 -> 4.1, and the move to Eql FW5.0.1 + new HIT kit).

Mintofoxburr
Contributor

Just like to say, Patters, this was a great post. I had been trying to do the same config about a year ago for our Exchange 2007 virtual environment so I could take advantage of the HIT kit and DSM with ASM/ME. I finally got totally frustrated and confused, to the point of just presenting the drives as RDMs through vCenter. Anyway, this was an enlightening thread which cleared up a lot of confusion for me, so please let us know if you have any success with the new version upgrades - but hang on until version 5.0.2 is released, as I just received an email from my Dell rep the other day about bugs in the latest version. See below:

"You might have received this notice from Tech support but if you haven’t please don’t upgrade your firmware on Equallogic to 5.0.0 or 5.0.1. These two firmwares have minor bugs that may in rare cases affect your replication process."

Firmware 5.0.2 will be out soon and would be ok to upgrade to.

Please forward this to any storage administrator involved in storage firmware upgrades in your organization.

patters98
Enthusiast

Hi Mintofoxburr,

Thanks for the firmware heads-up. Luckily I had received that email the day before I planned to upgrade! Dodged a bullet there! Kudos to Dell for how they managed that too - they emailed all the people who had downloaded it.

I'm nearly at the point where I can test out the new stuff. I got slowed down by a terrible upgrade process from vCenter 4.0 to 4.1 (http://pcloadletter.co.uk/2010/07/26/upgrading-to-vcenter-4-1/), but my cluster is running ESXi 4.1 now and Broadcom TOE cards are finally recognised as HBAs. One hypervisor has the new EqualLogic PSP driver installed (it supports Broadcom TOE HBAs) and I should be able to do some failover testing next week, I think. I'll post back with my findings.
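
For anyone else at the same stage, a couple of quick checks after the 4.1 upgrade (a sketch only - DELL_PSP_EQL_ROUTED is the name the EqualLogic multipathing module is expected to register, so verify against your own install):

  esxcli nmp psp list        # the EqualLogic path selection plugin (DELL_PSP_EQL_ROUTED) should be listed
  esxcli nmp satp list       # shows which SATP -> PSP defaults are in effect
  esxcfg-scsidevs -a         # storage adapters - the Broadcom iSCSI offload ports should appear as vmhba devices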

brianlaw
Contributor

Hi,

I am running firmware 5.0.1 on ESXi 4.0 U2. I submitted a diagnostic report to Dell and luckily they told me they didn't find any problems and that I could keep using it.

They also provided some more info about the firmware issues.

Known issue of EQL firmware 5.0.0 / 5.0.1:

• Volumes might not come online properly immediately after the install

After the upgrade some volumes might go offline.

• Replication might not occur properly

The replication issue is with Multi Member groups.

• VMware V4.1 Zero offload performance might be affected

The ESX v4.1 issue can be avoided by temporarily disabling the VAAI features in ESX. However, since this also involves replicated volumes, this issue would also not affect us here.

For the MPIO issue, they suggested I use two vNICs, with each vNIC's port group configured to use the two pNICs in an active/standby arrangement.

patters98
Enthusiast

...thus choking your peak I/O throughput down to 1Gbps (125MB/s), when you bought an appliance that can handle double that. I'm not happy with that.

These things are sold to run active/active multipathing.

patters98
Enthusiast

Since there are a fair few people in this thread with similar setups - I wrote up my upgrade to 4.1 steps here:

http://pcloadletter.co.uk/2010/08/11/vsphere-upgrade-to-4-1-with-equallogic/

DylanAtDell
Contributor

This comment is regarding the EqualLogic FW versions 5.0.0 and 5.0.1, not multipathing

Brianlaw et al,

I work in Dell's EqualLogic Product Group and wanted to add a couple of things. First, sorry about any trouble from the firmware updates. If you have installed either version, or got a new array with 5.0.0 or 5.0.1 on it, please call support, even if you haven't seen any issues. If you have not installed v5.0.0 or 5.0.1 yet, please wait a few more weeks before installing any firmware update above 4.3.6. As you indicated, our engineering team identified a few changes they needed to make to those firmware versions, and they're planning to release an updated version (v5.0.2) on or around August 30, 2010.

Please call your EqualLogic support number and give them your service tag if you want more answers about this issue and your environment specifically. We try to act fast and communicate directly on these kinds of things, so please let us know if you have any additional questions.

Thank you, Dylan
