Hi Folks,
I've just started testing with ESXi 4 and have hit a snag with the Software iSCSI Adaptor and Multipath I/O.
From a bit of background reading, I understand that the storage architecture is a little different from 3.5, with the new PSA (Pluggable Storage Architecture).
Anyway, here's the set-up.
Single ESXi 4.0 Host
- vSwitch1
- Two physical uplink NICs
- Two port groups, with a VMKernel interface in each (See attachment 'Storage4-Network.jpg')
- 10.42.80.107 & 10.42.80.108
- The adaptor failover order has been overridden in the port groups so that only one adaptor is active in each respective group
- Added both VMkernel interfaces (vmk1 & vmk2) to the iSCSI adaptor, using the 'esxcli swiscsi nic' commands
- Basically, everything described in the VMware "iSCSI SAN Configuration Guide"
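For reference, the port-binding step looks roughly like this (a sketch only; 'vmhba33' is the software iSCSI adaptor name on this host, so substitute your own adaptor and vmk names):

```shell
# Sketch of the port binding described in the iSCSI SAN Configuration Guide.
# 'vmhba33' is assumed to be the software iSCSI adaptor; yours may differ.
esxcli swiscsi nic add -n vmk1 -d vmhba33
esxcli swiscsi nic add -n vmk2 -d vmhba33

# Confirm both VMkernel NICs are bound to the adaptor:
esxcli swiscsi nic list -d vmhba33
```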
Thecus R4500i iSCSI SAN
- Two physical NICs
- Each physical NIC has its own IP address
- 10.42.80.200 & 10.42.80.201
So in this configuration, I should end up with 4 paths:
ESXi 10.42.80.107 - SAN 10.42.80.200
ESXi 10.42.80.107 - SAN 10.42.80.201
ESXi 10.42.80.108 - SAN 10.42.80.200
ESXi 10.42.80.108 - SAN 10.42.80.201
Which is what I get if I look in "Configuration\Storage Adaptors\iSCSI Software Adaptor" (see attached 'Storage2-Paths.jpg').
But here's the problem: when I take a look in "Configuration\Storage\Devices\iSCSI Software Adaptor\Manage Paths", I only get a single path (see attached "Storage3-Paths.jpg").
Can anyone help explain this? Have I missed something?
This same set-up (servers, NICs, SAN, etc.) works fine with multiple paths in ESXi 3.5.
(And yes, I know that the Thecus isn't on the supported HCL, but we're just talking standard iSCSI here, no additional storage plugins, etc.)
Not uber-urgent, but annoying, as this lack of multipathing would stop a deployment of vSphere.
Thanks,
Graham.
Can you try rescanning the SW iSCSI adaptor in pic2, and refreshing in pic3?
If that does not solve it, please post output of :
esxcli nmp device list
and
esxcli swiscsi nic list -d vmhba??
My guess is that the path state is somehow not being refreshed, but they are actually connected through 4 paths, since pic2 shows the paths are active.
Thanks!
- Kun
I am guessing there is some issue with the LUN UUIDs. Can you perform a rescan and post/attach the messages from /var/log/messages around the time of the rescan?
Hi
Thanks for coming back to me.
I've tried a rescan of the iSCSI HBA (and a reboot of the ESXi server), but the issue persists.
Below is the output of the requested commands:
~ # esxcli nmp device list
mpx.vmhba0:C0:T0:L0
Device Display Name: Local TEAC CD-ROM (mpx.vmhba0:C0:T0:L0)
Storage Array Type: VMW_SATP_LOCAL
Storage Array Type Device Config:
Path Selection Policy: VMW_PSP_FIXED
Path Selection Policy Device Config: {preferred=vmhba0:C0:T0:L0;current=vmhba0:C0:T0:L0}
Working Paths: vmhba0:C0:T0:L0
naa.600508e0000000000039b6217eccc40e
Device Display Name: LSILOGIC Serial Attached SCSI Disk (naa.600508e0000000000039b6217eccc40e)
Storage Array Type: VMW_SATP_LOCAL
Storage Array Type Device Config:
Path Selection Policy: VMW_PSP_FIXED
Path Selection Policy Device Config: {preferred=vmhba1:C1:T0:L0;current=vmhba1:C1:T0:L0}
Working Paths: vmhba1:C1:T0:L0
mpx.vmhba33:C0:T0:L3
Device Display Name: Thecus iSCSI Disk (mpx.vmhba33:C0:T0:L3)
Storage Array Type: VMW_SATP_DEFAULT_AA
Storage Array Type Device Config:
Path Selection Policy: VMW_PSP_FIXED
Path Selection Policy Device Config: {preferred=vmhba33:C0:T0:L3;current=vmhba33:C0:T0:L3}
Working Paths: vmhba33:C0:T0:L3
~ # esxcli swiscsi nic list -d vmhba33
vmk1
pNic name: vmnic2
ipv4 address: 10.42.80.108
ipv4 net mask: 255.255.255.0
ipv6 addresses:
mac address: 00:14:4f:01:94:d6
mtu: 1500
toe: false
tso: true
tcp checksum: false
vlan: true
link connected: true
ethernet speed: 1000
packets received: 8423
packets sent: 3951
NIC driver: e1000
driver version: 8.0.3.1-NAPI
firmware version: N/A
vmk2
pNic name: vmnic1
ipv4 address: 10.42.80.107
ipv4 net mask: 255.255.255.0
ipv6 addresses:
mac address: 00:14:4f:01:94:d5
mtu: 1500
toe: false
tso: true
tcp checksum: false
vlan: true
link connected: true
ethernet speed: 1000
packets received: 2007
packets sent: 1494
NIC driver: e1000
driver version: 8.0.3.1-NAPI
firmware version: N/A
Also attached are the lines from /var/log/messages after performing a Rescan.
Thanks in advance for the help.
Regards,
Graham.
Sorry,
I should also follow that up with some additional testing.
I've confirmed that only one path is active (i.e. it's not a GUI issue), as I've tested disabling the switch port of the active iSCSI NIC, and things die in a very horrible way!
- None of the other paths take over as expected, so they're certainly not being seen as secondary paths
- The whole ESXi server locks up and becomes unresponsive until the iSCSI NIC's switch port is re-enabled
Thanks again,
Graham.
...
May 27 07:43:04 vmkernel: 0:00:25:02.627 cpu3:11806)WARNING: NMP: nmp_RegisterDevice: Registration of NMP device with primary uid 'mpx.vmhba33:C3:T0:L3' failed. Already exists
...
This confirms that there is some issue with the LUN UUID (the one we get from the VPD inquiry). I can't see any logging indicating what is wrong, i.e. whether the VPD inquiry failed or the data in the response isn't good. I think we need a network trace taken during the rescan. Is it possible for you to get one during the rescan operation? Since you are using ESXi 4, you will need to obtain the trace externally.
BTW, could you provide more info on the storage? I have never used this storage before:
...
May 27 07:43:04 vmkernel: 0:00:25:02.626 cpu0:10845)ScsiScan: 839: Path 'vmhba33:C0:T0:L0': Vendor: 'Thecus ' Model: 'iSeries ' Rev: '2.3.'
...
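To illustrate the "Already exists" warning above, here is a toy Python sketch (not VMware code, purely an illustration) of how a device registry keyed on the primary UID rejects a second registration that presents the same UID:

```python
# Toy illustration (not VMware code): an NMP-style device registry keyed
# on the device's primary UID. When a path presents a UID that is already
# registered but cannot be collapsed into the existing device, the
# registration is rejected -- which is the failure the vmkernel log shows.
class DeviceRegistry:
    def __init__(self):
        self.devices = {}  # primary UID -> list of path names

    def register(self, uid, path):
        if uid in self.devices:
            raise ValueError(
                f"Registration of NMP device with primary uid '{uid}' "
                f"failed. Already exists"
            )
        self.devices[uid] = [path]

registry = DeviceRegistry()
registry.register("mpx.vmhba33:C0:T0:L3", "vmhba33:C0:T0:L3")
try:
    # A second path arrives carrying the same primary UID:
    registry.register("mpx.vmhba33:C0:T0:L3", "vmhba33:C3:T0:L3")
except ValueError as err:
    print(err)
```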
Shouldn't the VMkernels be on separate subnets (with a corresponding NIC/interface on the SAN)?
Right now both interfaces/NICs are on the same subnet, and I think that could be the problem.
What about changing the second NIC of the SAN to the 10.42.81.x subnet (or something else outside 10.42.80.x), and changing the IP address of the second VMkernel interface to a corresponding address (and putting them on a separate VLAN if possible)?
Correct me if I'm wrong!
But the "cheap" SAN might be having some problems with this?
Hi All,
Thanks for the feedback. Just some quick responses
asp24 - I totally agree that the 'low cost' storage we're using may not be ideal, but my point is that it works with ESXi 3.5.
I take your point about putting the storage NICs in different IP subnets, but this warning from the VMware iSCSI SAN Configuration Guide (page 32) led me towards keeping everything in the same subnet:
"CAUTION If the network adapter you add to software iSCSI initiator is not in the same subnet as your iSCSI target, your host is not able to establish sessions from this network adapter to the target."
Keeping both the ESXi VMkernel iSCSI NICs and the iSCSI target NICs in the same subnet therefore gives you an 'any-to-any' approach.
From a session/connection perspective, I believe that both ESXi and the iSCSI storage are connecting fine. If you check out 'Storage2-Paths.jpg' in the original post, there are 4 connections, as expected. And on the storage side, 4 sessions are established.
The problem is that ESXi is only using one of the paths and disregarding the others, so when that path breaks, it doesn't fail over to an alternate path. Not fun...
Unfortunately as the storage is used in production, I can't change the networking to test your theory. But thanks for the suggestion, all ideas are gratefully received!
deeping - Thanks for the confirmation.
paithal - OK, good to know that there may be an issue. I appreciate you taking a look into it.
Here's the link to the storage provider's website: http://www.thecus.com/index.php?set_language=english
The specific model that we're using is their i4500R: http://www.thecus.com/products_over.php?cid=32&pid=68&set_language=english
The network trace could prove interesting, as the test system is in our main office and I work a few hundred miles away. But I like a challenge, so leave it with me for a little while and I'll see what I can figure out.
And by network trace, I'm assuming you mean something along the lines of a tcpdump and Wireshark/Ethereal capture. What format works best for you - is pcap OK?
Thanks again to all for their support.
Graham.
Nothing "rude" (I don't know a better way to describe this with my English skills) meant about the "cheap SAN". We are also using a couple of similar ones. I was just referring to equipment costing less than $50k, which might not have been designed/tested against VMware the way HP/IBM/EMC etc. kit is.
I think the "keep on the same subnet" statement refers to not routing the iSCSI traffic, i.e. having the iSCSI SAN and the iSCSI VMkernel on the same subnet. I don't think it means you can't have multiple subnets carrying iSCSI traffic/VMkernels.
I hope you are able to solve your problems!
My Infortrend ISCSI san is configured like this (esx finds two paths and failover works):
Gig1 port: Infortrend 192.168.251.2/24, iSCSI service console (ESX) 192.168.251.x1, VMkernel (ESX) 192.168.251.x2
Gig2 port: Infortrend 192.168.252.2/24, iSCSI service console (ESX) 192.168.252.x1, VMkernel (ESX) 192.168.252.x2
(where x is the ESX host's internal numbering)
The problem I think the Thecus box might have is the same one some computers have when two NICs are configured in the same subnet: it is probably only using one of the NICs. And maybe the "failover" is not happening automatically (or is too slow)? That is why I think two different subnets could be the solution.
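The subnet question can be sanity-checked with Python's standard ipaddress module. A quick sketch, assuming the /24 masks used in this thread:

```python
# Quick check of which addresses land in the same subnet, assuming the
# /24 masks (255.255.255.0) used in this thread.
import ipaddress

def same_subnet(addr_a, addr_b, prefix=24):
    """Return True if both addresses fall in the same /prefix network."""
    net_a = ipaddress.ip_network(f"{addr_a}/{prefix}", strict=False)
    net_b = ipaddress.ip_network(f"{addr_b}/{prefix}", strict=False)
    return net_a == net_b

# Current layout: ESXi vmk and SAN NIC share 10.42.80.0/24
print(same_subnet("10.42.80.107", "10.42.80.200"))  # True

# asp24's suggestion: move the second SAN NIC to 10.42.81.x
print(same_subnet("10.42.80.107", "10.42.81.200"))  # False
```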
Hi,
Don't worry, I'm not offended!
I think you could well be right about the different subnets, but unfortunately I can't test it due to the storage being production. I'll have to have a word with the Boss and see if I can 'break' some stuff at the weekend...
With regard to your set-up,
Are you running ESX or ESXi?
Are you running v3.5 or v4.0
Just curious as you mentioned having a 'service console'.
I'm also working on getting some packet captures to see what more we can discover.
Regards,
Graham.
I'm running ESX 4.0 (not i)
The reason for the service consoles is that I upgraded from ESX 3.x, where iSCSI required a service console connection for iSCSI authentication etc.
Hi paithal,
OK, I think I've managed to get a packet capture.
I had to tweak the environment a little, as the ESXi iSCSI NICs are on different switches and I could only capture off one.
So for this capture, I only had one of the ESXi iSCSI NICs connected - vmk1 (10.42.80.108)
This should still give me 2 possible paths:
ESXi 10.42.80.108 - SAN 10.42.80.200
ESXi 10.42.80.108 - SAN 10.42.80.201
Which is what happened and is confirmed in the attached screenshot (Storage5-Paths.jpg).
But still, only one path was seen in the Manage Paths dialogue, just as before (original post Storage3-Paths.jpg).
The full capture was too big to post (90MB) as it contained a whole load of SCSI Writes.
So I've filtered it down to what I think (with my limited knowledge) would be useful.
ESXi-iSCSI-Rescan-Filtered.txt (.pcap file)
This is based on this filter: (scsi.sbc.opcode == 0x00) or (scsi.sbc.opcode == 0xa0) or (scsi.sbc.opcode == 0x12)
With the Op Codes being:
Test Unit Ready: scsi.spc.opcode == 0x00
Report LUNs: scsi.spc.opcode == 0xa0
Inquiry LUN: scsi.spc.opcode == 0x12
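For anyone reproducing this, the display filter above can be built from that opcode table like so (a sketch only; the opcodes are the standard SCSI values, and the field name is the one used in the filter):

```python
# Build the Wireshark display filter from the SCSI opcode table above.
opcodes = {
    "Test Unit Ready": 0x00,
    "Report LUNs": 0xA0,
    "Inquiry": 0x12,
}

display_filter = " or ".join(
    f"(scsi.sbc.opcode == {op:#04x})" for op in opcodes.values()
)
print(display_filter)
# (scsi.sbc.opcode == 0x00) or (scsi.sbc.opcode == 0xa0) or (scsi.sbc.opcode == 0x12)
```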
ESXi-iSCSI-Rescan-Filtered1.txt (.pcap file)
This is based on this filter: iscsi.initiatortasktag == 0x00000001
Which is the iSCSI Send Targets task.
FYI - In the iSCSI Dynamic Discovery property dialogue, I've configured 10.42.80.200 (one of the SAN IPs)
If there's something different or additional that you need, let me know, I can just re-filter the original capture. Or PM me and we can work out a way to send the original file.
Any light that you can shed would be appreciated.
Regards,
Graham.
Looking at the trace, it appears that the array is not conforming to the spec when it sends the VPD data. The NAA ID format is wrong. According to "spc3r23, 7.6.3.6.1 NAA identifier basic format", the first nibble of the NAA ID can be 0x2, 0x5, or 0x6. However, the trace shows 0x9. ESX 4.0 doesn't register multiple paths to the LUN because of this. Is there a way to disable the "Device Identification Page" from the array side?
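A quick sketch of the check described here (per this post's reading of spc3r23, 7.6.3.6.1: the first nibble of an NAA identifier must be 0x2, 0x5, or 0x6):

```python
# Sketch of the NAA-type check described above: per spc3r23 7.6.3.6.1
# (as quoted in this thread), the first nibble of an NAA identifier
# must be 2h, 5h, or 6h. The Thecus array is reporting 9h.
VALID_NAA_TYPES = {0x2, 0x5, 0x6}

def naa_type_is_valid(naa_id_hex):
    """naa_id_hex: the identifier as a hex string, e.g. '600508e0...'."""
    first_nibble = int(naa_id_hex[0], 16)
    return first_nibble in VALID_NAA_TYPES

# The local LSI disk's ID from the 'esxcli nmp device list' output passes:
print(naa_type_is_valid("600508e0000000000039b6217eccc40e"))  # True

# An ID starting with the 0x9 nibble seen in the trace fails:
print(naa_type_is_valid("9" + "0" * 31))  # False
```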
Thanks for looking into the trace.
I think you've hit the nail on the head. I've had a look through the options on the Storage device, but there isn't any way that I can see to disable the "Device Identification Page" as you've suggested.
Also over the weekend I did more testing around changing the Storage to use IP Addresses in different subnets, as suggested by asp24. This had the same results as my original set-up, with only one path being available for selection.
Then finally, to confirm that the Storage was the heart of the problem, I did more testing against a FreeNAS device. vSphere / ESXi performed exactly as expected with regard to the multipathing. In several different configurations, multiple paths were available for selection in the Manage Paths GUI.
Because I was starting to confuse myself with the options, I knocked up a quick doc just for my own reference (attached).
To other readers: This is by no means 100% accurate, this is just my experience from the limited testing that I've done. A huge amount depends on how your storage works and what aggregation features your network can provide. Please don't take it as gospel, but maybe just use it to try different scenarios of your own.
Thanks again to all who posted responses to this thread. It has been very much appreciated.
Graham.
You should contact the storage vendor to resolve this multipath issue. This storage doesn't seem to be certified with ESX (3.x or 4.0).