s_buerger
Contributor

Failover problems between ESX and Dell EQL PS4000

Hi,

I have had a case open with VMware about this issue for some time, still with no solution, but I hope someone here has seen this behavior.

We have a new PS4000 and I did some failover tests. I set it up as suggested in Dell's TR1049 guide.

Setup:

1 ESX 4.0u1 (HP DL380G5, 2 Dual Port Intel PT1000 for iSCSI)

2 Dell 5224 Gigabit Switches dedicated for iSCSI (newest firmware)

1 PS4000

Current configuration (I have already tried different setups: only two vmks with two pNICs, or 4 vmks with 2 pNICs):

4 vmks through 4 vSwitches to 4 pNICs

2 pNICs to switch1, 2 pNICs to switch2

The switches are connected through a 6 Gbit LAG

The SAN is connected with one interface to switch1 and one to switch2 (the passive controller is cabled the same way, but it is not used for this failover test)

Now, when I connect to a volume, I get 4 logical connections.

It is possible that some of these connections go through the LAG, because the login first goes to the group IP and the array then tells the client which interface IP it should connect to.

For example:

source: ESX interface on switch2

destination: SAN interface on switch1

This connection goes: ESX <-> switch2 <-> switch1 <-> SAN

Now, when I power off switch1, all connections that go through switch1 are dead, which is expected.

The connection that goes from the ESX interface on switch2 to the SAN interface on switch1 should fail over: it should reconnect to the group IP, which should tell the ESX host to connect to the interface IP on switch2. But it doesn't, not after seconds, not after hours.

Now, when I power on switch1 again, all connections reconnect except the ones that went through the LAG; those are still marked dead.

If I do a rescan they are gone. They only come back when I reboot the ESX server.

So failover doesn't work as expected, and failback only works for some connections.

My question is:

Has anyone seen this behavior in failover tests, or did it work for you?

If it worked, are you sure the connections were going through the LAG? Direct connections are no problem.

The worst case (which is more likely with 2 logical connections than with 4) is that all connections to a volume go through one switch; when that switch goes down, all connections are dead and nothing fails over.

Some additional behavior:

Some additional behavior: when switch1 is off, I can't vmkping anymore. I should still be able to vmkping the second SAN interface and the group IP, but there is no response. There is a command-line switch that tells vmkping which interface to use, but no luck, still no reply. It seems that vmkping only uses the first vmkernel interface. From a management server (Win2k3) I can still ping the interface on switch2 and the group IP.
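For reference, the kind of test I mean looks roughly like this (I believe the interface switch is -I; the vmk number is just an example from my setup, and .20 is the group IP here):

# vmkping -I vmk2 192.168.55.22   (second EQL interface, on switch2 - no reply)
# vmkping -I vmk2 192.168.55.20   (group IP - no reply either)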

RParker
Immortal

Forget about the switch, forget about the SAN.

Since you say ESX is failing, this should only involve the ESX host. So: 2 pNICs on 1 vSwitch.

Put BOTH pNICs on the SAME Dell switch (this will eliminate path problems). If you pull one pNIC connection, what happens?

What does VMware say about the problem? I find it difficult to believe they simply ignored it or just left it, if it is truly an ESX issue (I have my doubts).

If it is an ESX problem, the above scenario should not work; if it does work, then something else must be configured wrong on the switch.

s_buerger
Contributor

Hi,

I can do this test tomorrow, but I believe it will work, because it is not the same situation.

That kind of failover/failback already works in my setup: all connections that go straight through one switch are marked dead, and they reconnect within seconds after the switch is powered on again. So an Ethernet link down is detected by ESX, and the reconnect is initiated when the link comes back up.

In my example, the connections that are marked dead still have an Ethernet link on the ESX side; they just can't reach the old storage interface IP, because it is down. So they should switch to another IP or, if that is not possible, they should at least reconnect when switch1 is back again (it is possible that a timeout occurs and they don't reconnect after some time).

VMware isn't ignoring this problem, but it took a while to explain the setup/configuration and the problem. I think they are now trying to reproduce the error. I just hope someone has already seen this behavior and can help reduce the time (I need to get this setup into production).

Sven


J1mbo
Virtuoso

Unless you are planning to have more PS series arrays in this group, and need mega sequential throughput, it seems to me there is little advantage having more than 2 pNICs for iSCSI since the PS4000 has only two ports.

However, there seem to be a few oddities in your configuration. I think it should be (per ESX server):

- one vSwitch for iSCSI

- two (or four if you like) pNICs

- one vmkernel port per pNIC (all on the same subnet)

- for each vmkernel port, set the adapter failover order manually, with one bound adapter only (the other(s) disabled)

- manually bind the vmkernel ports to the sw-iSCSI initiator through the command line (see the sketch below)

- map the LUNs and set the policy to round-robin

- set the IOPS value per LUN to 3 if you want to get the sequential throughput advantage of the multiple links

On the EqualLogic, plumb one iSCSI interface of each controller into each physical switch, so that each controller is connected to both switches.

On the switches, enable flow control on all interfaces.
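For the command-line pieces, a rough sketch (vmhba33 is just the typical name of the sw-iSCSI adapter, and <naa.xxx> is a placeholder for the device ID, e.g. from esxcfg-scsidevs -l; adjust both to your host):

# esxcli swiscsi nic add -n vmk1 -d vmhba33
# esxcli swiscsi nic add -n vmk2 -d vmhba33
# esxcli swiscsi nic list -d vmhba33

# esxcli nmp device setpolicy --device <naa.xxx> --psp VMW_PSP_RR
# esxcli nmp roundrobin setconfig --device <naa.xxx> --type iops --iops 3

The first block binds the vmkernel ports to the software initiator; the second sets Round Robin and the IOPS threshold per LUN.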

More info here: http://www.techhead.co.uk/dell-equallogic-ps4000-hands-on-review-part-2

http://blog.peacon.co.uk

Please award points to any useful answer.

s_buerger
Contributor

Hi J1mbo,

At first I used exactly this configuration.

But I had the problems I described with the connections that went through the LAG.

To get more logical connections I then moved to an "overcommitted" setup as described in the Dell EqualLogic technical report TR1049.

With more logical connections it is less likely that all of them go through one switch. But this is only a workaround; it is still possible that all connections go through one switch.

My current setup is like http://communities.vmware.com/message/1460812#1460812 (but with 4 pNICs, and 2 port groups for VM iSCSI on 2 of the 4 vSwitches). VMware told me this sharing would not harm the VMFS connections.

I know that, throughput-wise, 4 Gbit is much more than I need, but without correct failover behavior I need these connections for redundancy.

Another example of what I mean:

The connection process works like this: an ESX vmkernel port connects to the group IP, the EQL tells the ESX host to connect to interface IP X, and the ESX host then makes the connection to that IP. This is repeated for every vmkernel port.

Now the following is possible (with 2 ESX connections), and it happens some of the time:

esx pnic1 <-> switch1 <-> switch2 <-> eql interface 2

esx pnic2 <-> switch2 <-> switch1 <-> eql interface 1

This is possible because nobody knows the layer 1/2 path from ESX interface X to EQL interface Y. During the connection process the EQL just chooses the interface with the smallest queue, or something like that. (This is more of a design flaw; the connection should use the "shortest path".)

Now, if switch1 goes down, all connections are killed, and ESX does not fail over, because it does not reconnect the connection from pNIC2 to the surviving EQL interface 2. It keeps trying to connect to the IP of EQL interface 1. It should simply connect to the group IP of the SAN to "request" a new interface IP.

The second problem is that when switch1 comes back, only the first of the two connections reconnects. Somehow the second is still unable to reconnect. I don't know yet whether this is because of some timeout that happened while switch1 was off.

So I have two problems: failover of the connections that go through the LAG (where the ESX side of the connection still has a link), and failback of these connections when the other switch is back again.

All connections that go straight through one switch have no problem: they are dead when the switch is off and alive when the switch is on. :)

Sven

pfuller
Contributor

What is the Path Selection Policy?

ESX Host --> Configuration --> Storage Adapters --> iSCSI Software Adapter

Right-click on the LUN and select Manage Paths.

It should be Round Robin (VMware) unless you have a policy from EqualLogic.

How many paths show up? How many of them are active?
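You can also check this from the service console; a quick sketch (vmhba33 being the usual name of the software iSCSI adapter, adjust if yours differs):

# esxcli nmp device list    (shows the Path Selection Policy per device)
# esxcfg-mpath -b           (brief list of all paths and their state)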

s_buerger
Contributor

Hi pfuller,

It's Round Robin (Fixed doesn't fix the described problem; I tried that).

All paths are Active (I/O) before the test.

Here is the whole documentation of the latest failover/failback test:

[screenshot attachment: 10182_10182.jpg]

4 vmks, 4 vSwitches, 4 pNICs

vmnic4 + 5 = one dual-port GBit NIC, vmnic6 + 7 = one dual-port GBit NIC

vmnic4 + 6 <-> switch1, vmnic5 + 7 <-> switch2

As you can see, vmhba33:C3:T2:L0 and T3:L0 go from the ESX interfaces that are connected to switch2 to the SAN interface on switch1, so they go through the LAG.

Now I powered off switch1:

INFO 14.07.10 17:03:18 ps2 iSCSI session to target '192.168.55.22:3260, iqn.2001-05.com.equallogic:0-8a0906-dccf80507-032000d06af4c341-s2-esx-vm4' from initiator '192.168.55.52:51742, iqn.1998-01.com.vmware:esx1-41360ad8' was closed. iSCSI initiator connection failure. No response on connection for 6 seconds.

INFO 14.07.10 17:03:18 ps2 iSCSI session to target '192.168.55.21:3260, iqn.2001-05.com.equallogic:0-8a0906-dccf80507-032000d06af4c341-s2-esx-vm4' from initiator '192.168.55.53:49614, iqn.1998-01.com.vmware:esx1-41360ad8' was closed. iSCSI initiator connection failure. Network link state changed to down.

INFO 14.07.10 17:03:18 ps2 iSCSI session to target '192.168.55.21:3260, iqn.2001-05.com.equallogic:0-8a0906-dccf80507-032000d06af4c341-s2-esx-vm4' from initiator '192.168.55.50:49605, iqn.1998-01.com.vmware:esx1-41360ad8' was closed. iSCSI initiator connection failure. Network link state changed to down.

INFO 14.07.10 17:03:17 ps2 iSCSI session to target '192.168.55.22:3260, iqn.2001-05.com.equallogic:0-8a0906-79ff80507-0aa000d06ac4c341-s2-esx-vm3' from initiator '192.168.55.52:61315, iqn.1998-01.com.vmware:esx1-41360ad8' was closed. iSCSI initiator connection failure. No response on connection for 6 seconds.

INFO 14.07.10 17:03:17 ps2 iSCSI session to target '192.168.55.21:3260, iqn.2001-05.com.equallogic:0-8a0906-059050507-e2c000000264c20b-s2-esx-vm1' from initiator '192.168.55.52:55552, iqn.1998-01.com.vmware:esx1-41360ad8' was closed. iSCSI initiator connection failure. Network link state changed to down.

INFO 14.07.10 17:03:17 ps2 iSCSI session to target '192.168.55.21:3260, iqn.2001-05.com.equallogic:0-8a0906-79ff80507-0aa000d06ac4c341-s2-esx-vm3' from initiator '192.168.55.53:49607, iqn.1998-01.com.vmware:esx1-41360ad8' was closed. iSCSI initiator connection failure. Network link state changed to down.

INFO 14.07.10 17:03:17 ps2 iSCSI session to target '192.168.55.21:3260, iqn.2001-05.com.equallogic:0-8a0906-79ff80507-0aa000d06ac4c341-s2-esx-vm3' from initiator '192.168.55.50:49603, iqn.1998-01.com.vmware:esx1-41360ad8' was closed. iSCSI initiator connection failure. Network link state changed to down.

INFO 14.07.10 17:03:17 ps2 iSCSI session to target '192.168.55.21:3260, iqn.2001-05.com.equallogic:0-8a0906-059050507-e2c000000264c20b-s2-esx-vm1' from initiator '192.168.55.50:61983, iqn.1998-01.com.vmware:esx1-41360ad8' was closed. iSCSI initiator connection failure. Network link state changed to down.

INFO 14.07.10 17:03:17 ps2 iSCSI session to target '192.168.55.21:3260, iqn.2001-05.com.equallogic:0-8a0906-11ef80507-558000d06a44c333-s2-esx-vm2' from initiator '192.168.55.50:52425, iqn.1998-01.com.vmware:esx1-41360ad8' was closed. iSCSI initiator connection failure. Network link state changed to down.

INFO 14.07.10 17:03:17 ps2 iSCSI session to target '192.168.55.21:3260, iqn.2001-05.com.equallogic:0-8a0906-11ef80507-558000d06a44c333-s2-esx-vm2' from initiator '192.168.55.52:61971, iqn.1998-01.com.vmware:esx1-41360ad8' was closed. iSCSI initiator connection failure. Network link state changed to down.

INFO 14.07.10 17:03:13 ps2 iSCSI session to target '192.168.55.21:3260, iqn.2001-05.com.equallogic:0-8a0906-c34050507-4c0000000294c21f-santest2' from initiator '192.168.55.182:1032, iqn.1991-05.com.microsoft:san2' was closed. iSCSI initiator connection failure. Network link state changed to down.

INFO 14.07.10 17:03:13 ps2 iSCSI session to target '192.168.55.21:3260, iqn.2001-05.com.equallogic:0-8a0906-decf80507-c8e000d06b24c341-s2-esx-vm5' from initiator '192.168.55.52:61856, iqn.1998-01.com.vmware:esx1-41360ad8' was closed. iSCSI initiator connection failure. Network link state changed to down.

INFO 14.07.10 17:03:13 ps2 iSCSI session to target '192.168.55.21:3260, iqn.2001-05.com.equallogic:0-8a0906-decf80507-c8e000d06b24c341-s2-esx-vm5' from initiator '192.168.55.50:49611, iqn.1998-01.com.vmware:esx1-41360ad8' was closed. iSCSI initiator connection failure. Network link state changed to down.

All connections that go directly to or from switch1 are dead, which is correct.

All connections going through the LAG between switch2 <-> switch1 are dead, which is also correct.

All VMFS volumes except vm3 and vm4 still have 2 paths.

The question is:

Why doesn't ESX reconnect vmhba33:C3:T2:L0 and vmhba33:C3:T3:L0 within x seconds?

There is no path to .21 anymore, but shouldn't they reconnect through .20?

And I believe this is a big problem.

It's possible for all logical paths to a VMFS volume to go down. I have already seen all connections to a volume going through one switch (some directly, some through the LAG).

And the "LAG" connections didn't switch over within seconds, I think because the ESX host still tries to connect to the target on .21, not on .20.

I waited hours to see whether these two connections would reconnect to the SAN; they didn't.

Then I powered on switch1 again, and all connections except these two were back within seconds.

Then I did a rescan and both connections were gone from the list until I rebooted the server.
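(For reference, the path listing and the rescan can also be done from the service console; a minimal sketch, assuming vmhba33 is the software iSCSI adapter:

# esxcfg-mpath -b         lists all paths with their state
# esxcfg-rescan vmhba33   rescans the adapter, the same as the GUI rescan)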

J1mbo
Virtuoso

Why do you have multiple vSwitches? The configuration I linked to above works perfectly; I would suggest using that.

http://blog.peacon.co.uk

Please award points to any useful answer.

s_buerger
Contributor

Sorry, but as I already explained, I had the same issues with your setup.

After that I changed to the new configuration because it can use 4 physical NICs instead of just two.

J1mbo
Virtuoso

What I've posted is the configuration recommended by EqualLogic.

Since you have multiple vSwitches, there is no way for a vmkernel port on a vSwitch with disconnected media to connect to anything, as you have demonstrated.

http://blog.peacon.co.uk

Please award points to any useful answer.

s_buerger
Contributor

Sorry, but you did not understand what I mean.

The first test I did used exactly the configuration you posted.

It showed the same behavior.

All connections that go directly through only one switch worked as expected, but the connections that went through the LAG did not fail over or fail back.

My first setup:

[screenshot attachment: 10195_10195.png]

Switch Name Num Ports Used Ports Configured Ports MTU Uplinks

vSwitch5 64 5 64 1500 vmnic5,vmnic7

PortGroup Name VLAN ID Used Ports Uplinks

iSCSI2 0 1 vmnic7

iSCSI1 0 1 vmnic5

vmnic5 on switch1

vmnic7 on switch2

switch1 and 2 are connected through a LAG

eql

eth0 on switch1, eth1 on switch2

This is exactly the setup you posted.

Now, again, the problem. One possible combination of connections is:

First path goes: vmnic5 <-> switch1 <-> switch2 <-> eql eth1

Second path goes: vmnic7 <-> switch2 <-> switch1 <-> eql eth0

Do you agree?

And this can happen because, when the connection is established, nobody (neither the ESX host nor the EQL) knows about my physical cabling...

Now power off switch1. What happens?

Here all connections are dead and none of them fail over.

The second path should reconnect to EQL eth1, but this doesn't work.

J1mbo
Virtuoso

You've got me doubting this now... :)

But I think we need to consider that it is the ESX sw-iSCSI component making the connections, not the pNICs in the server (which, as ever, are just uplinks).

Hence, provided the sw-iSCSI bindings have been configured, the diagram looks something more like:

sw-iSCSI initiator bound to...

- vmkernel port 1 -> vSwitch -> pNic1(A) -> pSwitch1 -> pSwitch2 -> EQL-1

- vmkernel port 2 -> vSwitch -> pNic2(A) -> pSwitch2 -> pSwitch1 -> EQL-2

Herein lies the problem... since we've set the failover order on the vmkernel ports to disable the secondary pNIC, as you state, it seems there is no way the traffic can be re-routed. If this were set manually with the opposite NICs listed as standby, traffic would obviously be switched through that route, i.e.

- vmkernel port 1 -> vSwitch -> pNic1(A) | pNic2 (S) -> pSwitch1 -> pSwitch2 -> EQL-1

- vmkernel port 2 -> vSwitch -> pNic2(A) | pNic1 (S) -> pSwitch2 -> pSwitch1 -> EQL-2

In that scenario when pSwitch1 is failed we are left with one working path:

- vmkernel port 1 -> vSwitch -> pNic2 -> pSwitch2 -> EQL-1

- vmkernel port 2 -> vSwitch -> pNic2 -> pSwitch2 ->

I must be missing something....

s_buerger
Contributor

This could be a possible solution for the problem.

I'll try this.

It could still cause another problem, because now vmkernel port 2 has a dead connection, similar to the LAG connection before.

It still has a link to the switch but nothing to connect to. But we will see...

s_buerger
Contributor

Trying to reconfigure:

esxcfg-vmknic -l

Interface Port Group/DVPort IP Family IP Address Netmask Broadcast MAC Address MTU TSO MSS Enabled Type

vmk0 vMotion IPv4 192.168.56.51 255.255.255.0 192.168.56.255 00:50:56:7f:b2:59 1500 65535 true STATIC

vmk1 iSCSI1 IPv4 192.168.55.50 255.255.255.0 192.168.55.255 00:50:56:72:da:22 1500 65535 true STATIC

vmk2 iSCSI2 IPv4 192.168.55.51 255.255.255.0 192.168.55.255 00:50:56:7a:4a:60 1500 65535 true STATIC

but

# esxcli swiscsi nic list -d vmhba33
vmk1
    pNic name:
    ipv4 address: 192.168.55.50
    ipv4 net mask: 255.255.255.0
    ipv6 addresses:
    mac address: 00:00:00:00:00:00
    mtu: 1500
    toe: false
    tso: false
    tcp checksum: false
    vlan: false
    link connected: false
    ethernet speed: 0
    packets received: 0
    packets sent: 0
    NIC driver:
    driver version:
    firmware version:
vmk2
    pNic name:
    ipv4 address: 192.168.55.51
    ipv4 net mask: 255.255.255.0
    ipv6 addresses:
    mac address: 00:00:00:00:00:00
    mtu: 1500
    toe: false
    tso: false
    tcp checksum: false
    vlan: false
    link connected: false
    ethernet speed: 0
    packets received: 0
    packets sent: 0
    NIC driver:
    driver version:
    firmware version:

Before this change, this output also showed the MAC addresses and the driver versions...

No change after a reboot.

But I still have 2 paths to every volume.

I will now remove the whole iSCSI configuration and recreate it (see the sketch below).
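A rough sketch of what I mean (assuming the software iSCSI adapter stays vmhba33):

# esxcli swiscsi nic remove -n vmk1 -d vmhba33
# esxcli swiscsi nic remove -n vmk2 -d vmhba33
(recreate the iSCSI vSwitch and port groups, one active uplink per vmkernel port)
# esxcli swiscsi nic add -n vmk1 -d vmhba33
# esxcli swiscsi nic add -n vmk2 -d vmhba33
# esxcli swiscsi nic list -d vmhba33
# esxcfg-rescan vmhba33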

pfuller
Contributor

s_buerger
Contributor

Hi J1mbo,

2 hours later: it's sad, but it is not possible to bind vmks that have one active and one standby pNIC to the iSCSI initiator. 😕

esxcfg-vmknic -l

Interface Port Group/DVPort IP Family IP Address Netmask Broadcast MAC Address MTU TSO MSS Enabled Type

vmk0 vMotion IPv4 192.168.56.51 255.255.255.0 192.168.56.255 00:50:56:7f:b2:59 1500 65535 true STATIC

vmk1 iSCSI1 IPv4 192.168.55.50 255.255.255.0 192.168.55.255 00:50:56:72:da:22 1500 65535 true STATIC

vmk2 iSCSI2 IPv4 192.168.55.51 255.255.255.0 192.168.55.255 00:50:56:7a:4a:60 1500 65535 true STATIC

Switch Name Num Ports Used Ports Configured Ports MTU Uplinks

vSwitch3 64 5 64 1500 vmnic4,vmnic6

PortGroup Name VLAN ID Used Ports Uplinks

iSCSI2 0 1 vmnic6,vmnic4

iSCSI1 0 1 vmnic4,vmnic6

# esxcli swiscsi nic add -n vmk1 -d vmhba33

Errors:

Add Nic failed in IMA.

# esxcfg-vswitch -p iSCSI1 -N vmnic6 vSwitch3

# esxcli swiscsi nic add -n vmk1 -d vmhba33

So this is not a solution. :(

s_buerger
Contributor

Thanks, I already used TR1049 to configure my connections.

There should be a part 4 where they test the setup with 2 switches interconnected with a LAG.

pfuller
Contributor

One thing I am looking into is the "EqualLogic Multipathing Extension Module for VMware vSphere" that just came out this month.

It is under the VMware Integration download section.

EqualLogic Multipathing Extension Module for VMware® vSphere

Version 1.0.0

Date Released July 2010

Notes:

The EqualLogic Multipathing Extension Module (EqualLogic MEM) provides the following enhancements to the standard VMware vSphere multipathing functionality:

  • Automatic connection management

  • Automatic load balancing across multiple active paths

  • Increased bandwidth

  • Reduced network latency

The EqualLogic MEM V1.0.0 supports vSphere ESX/ESXi v4.1
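If I remember the documentation correctly, the MEM is delivered as an offline bundle plus a setup script and can be installed remotely with the vSphere CLI, roughly like this (the bundle file name is only a placeholder, take it from the download):

vihostupdate --server <esx-host> --username root --install --bundle dell-eqllogic-mem-<version>.zip

Check the MEM user guide for the exact procedure; the included setup script can also create the iSCSI vSwitch and port bindings for you.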

s_buerger
Contributor

That's something. I will try it out.

The MEM is only for ESX 4.1. I can upgrade this test system, but I can't upgrade the other 2 production systems right now, and they also have to work in this SAN network. 😕

J1mbo
Virtuoso

The odd thing is that I tested the 'recommended' configuration to death and it worked perfectly in all situations (switch, cable, and controller failures). But looking at it now, I don't follow exactly how it survived a switch failure.

http://blog.peacon.co.uk

Please award points to any useful answer.
