MaximH
Contributor

iSCSI boot fails when connecting to LACP configured storage

Hi!

This is something that I've discovered in my lab configuration, and now I'm not quite sure of how to report/confirm this. So I thought I'd start here.

What started this was that, after successfully installing vSphere 5.5 on my iSCSI LUN (mounted through iBFT), I kept getting various corruption errors when trying to boot the newly installed ESXi image.

It was either a purple screen, saying:

"Could not load multiboot modules: Boot image is corrupted".

Or during the loading of components, on the initial boot sequence, I would get something like:

error loading /s.v00

fatal error: 33 (inconsistent data)

After googling and troubleshooting this for a couple of hours, trying numerous reinstalls, confirming installation media hash sums, and so on, I decided to rule out a network issue of some kind.

To do this, I started by simplifying the configuration on the host, connecting only one network adapter to begin with.

Lo and behold, it booted right away, just moments after the previous boot attempt had told me that my boot image was corrupted. No other changes were made.

To confirm that it was indeed an issue with multiple network adapters, I reconnected a second adapter on the host, and rebooted. Result: "Boot image is corrupted".

So my prevailing theory right now is that the checks the hypervisor/ESXi uses to decide whether data is "inconsistent" or "corrupted" cannot handle the data coming from multiple interfaces on the storage unit, as is the case with my storage, which is configured with 2 interfaces that are link aggregated with LACP.

Here's an overview of my lab environment as it is configured now:

LabConfig001.PNG

The relevant switch configuration:

!
interface port-channel 1
     description STORAGE-001
     spanning-tree portfast
     switchport mode trunk
     switchport trunk allowed vlan add 100,2000
!
interface ethernet g1
     channel-group 1 mode auto
     description STORAGE-001-1
     spanning-tree portfast
!
interface ethernet g2
     channel-group 1 mode auto
     description STORAGE-002-2
     spanning-tree portfast
!
interface ethernet g5
     description ESX-001-1
     spanning-tree portfast
     switchport mode trunk
     switchport trunk allowed vlan add 100,2000
!
interface ethernet g6
     description ESX-001-2
     spanning-tree portfast
     switchport mode trunk
     switchport trunk allowed vlan add 100,2000
!

As you can see, the storage interfaces use link aggregation with LACP, while the ESXi interfaces are simply trunked, with no relation to each other.

Hope someone can help confirm this, or maybe provide some tips on how I can troubleshoot this further in order to find out why this configuration isn't working, which it should...

--

Max

JPM300
Commander

Unless things have changed, iSCSI can't be used with LACP. You will want to set up two separate iSCSI connections in ESXi and two single connections from your SAN.

You have two configuration options with this:

Option A.)

ESXi config:
Portgroup configuration:

iSCSI1 - vmk1 - 10.0.5.5 - vmnic4 ONLY; all other vmnics on the iSCSI vSwitch set to unused

iSCSI2 - vmk2 - 10.0.5.6 - vmnic5 ONLY; all other vmnics on the iSCSI vSwitch set to unused

SAN Config

iSCSI Port 1 - 10.0.5.7

iSCSI Port 2 - 10.0.5.8

Bind vmk1 and vmk2 to the software iSCSI initiator; this is done in the network configuration settings of the software iSCSI initiator.
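
If you prefer doing it from the ESXi shell instead of the vSphere Client, it looks roughly like this. This is only a sketch: the software iSCSI adapter name (vmhba33 here) is just an example and will differ per host, and you should verify afterwards that the unused vmnic really shows as unused rather than standby:

     # enable the software iSCSI initiator if it isn't already
     esxcli iscsi software set --enabled=true

     # pin each iSCSI port group to a single active uplink
     esxcli network vswitch standard portgroup policy failover set --portgroup-name=iSCSI1 --active-uplinks=vmnic4
     esxcli network vswitch standard portgroup policy failover set --portgroup-name=iSCSI2 --active-uplinks=vmnic5

     # bind the VMkernel ports to the software iSCSI adapter
     esxcli iscsi networkportal add --adapter=vmhba33 --nic=vmk1
     esxcli iscsi networkportal add --adapter=vmhba33 --nic=vmk2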

Option B.)

Portgroup configuration:

iSCSI1 - vmk1 - 10.0.5.5 - vmnic4 ONLY; all other vmnics on the iSCSI vSwitch set to unused

iSCSI2 - vmk2 - 10.0.6.5 - vmnic5 ONLY; all other vmnics on the iSCSI vSwitch set to unused

SAN Config

iSCSI Port 1 - 10.0.5.7

iSCSI Port 2 - 10.0.6.7

No port binding required
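
From the ESXi shell, the Option B setup would look something like the following. Just a sketch, assuming the port groups live on a vSwitch called vSwitch1 and the software adapter is vmhba33 (both names will vary):

     # create the second port group and VMkernel port on the 10.0.6.x network
     esxcli network vswitch standard portgroup add --portgroup-name=iSCSI2 --vswitch-name=vSwitch1
     esxcli network ip interface add --interface-name=vmk2 --portgroup-name=iSCSI2
     esxcli network ip interface ipv4 set --interface-name=vmk2 --ipv4=10.0.6.5 --netmask=255.255.255.0 --type=static

     # then add both SAN ports as dynamic discovery targets on the software adapter
     esxcli iscsi adapter discovery sendtarget add --adapter=vmhba33 --address=10.0.5.7
     esxcli iscsi adapter discovery sendtarget add --adapter=vmhba33 --address=10.0.6.7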

VMware's native MPIO will take over your load balancing; you just have to set the path selection policy you want: Fixed, Round Robin (RR), or MRU. Most people with an active/active array will use Round Robin, assuming it's supported.
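
From the command line, that would be something like the following; the device identifier here is hypothetical, use the one shown for your own LUN:

     # list the devices and the path selection policy currently in use
     esxcli storage nmp device list

     # switch a given device to Round Robin
     esxcli storage nmp device set --device=naa.600140500000000000000000000000 --psp=VMW_PSP_RR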

If you're using iSCSI HBAs in your ESXi host, then use Option B; the only difference is that you don't need to set up the port groups, since you'll just configure your IQNs/IPs on the HBAs.

Hope this has helped

MaximH
Contributor

I was under the impression that my configuration was pretty much the recommended way of doing things, when utilizing port binding?

In reference to the 2nd image ("When to use port binding") in this KB article: VMware KB: Considerations for using software iSCSI port binding in ESX/ESXi

Any chance you have some reference links on iSCSI and LACP targets? I've tried to google the matter, but everything I find only talks in-depth about the iSCSI connection and options on the ESXi end, and not what it actually supports in terms of the targets.

What bothers me the most is that the configuration actually works pretty well once you get past the boot stage. When the host is running, and the port groups are configured, everything works as expected, with the load being balanced across the LACP connection on the SAN.

JPM300
Commander

Chris Wahl has a really good article on this:

http://wahlnetwork.com/2014/03/25/avoid-lacp-iscsi-port-binding-multi-nic-vmotion/

Nimble Storage, an up-and-coming SSD SAN provider that is well established in the VMware community, has a really good write-up on it as well:

https://connect.nimblestorage.com/servlet/JiveServlet/previewBody/1242-102-1-1173/LACP.pdf

I hope this has helped

MaximH
Contributor

Thanks for following up on this.

I've read through the articles you posted, but those again only discuss LACP and port binding on the hypervisor side.

In my configuration, there is no LACP or link aggregation of any kind on the hypervisor, nor will there be. I will use MPIO through port binding to achieve load balancing and failover on the hypervisor side.

As I mentioned above, this is my exact configuration, only with 2 links instead of 4:

portbinding.jpg

This is why I don't understand why it's not working, and is also why I'm leaning towards thinking this might be a bug in the iSCSI boot logic on the hypervisor.

JPM300
Commander

Hey,

In that diagram, I can't remember: do they specifically state it's LACP, or is it just a grouped target? SANs like Dell EqualLogic use a group or a VIP for the iSCSI controller system, but that still doesn't use LACP. The group or VIP IP is just a management IP, and an MPIO driver then sorts out which connection the traffic enters/leaves on.

Well, if the storage vendor supports those iSCSI ports being in an LACP channel, then it should be okay on that end. However, most SANs I have seen don't support the LACP protocol on the iSCSI ports, or the best practice is to break it up and let MPIO sort it out.

That being said, from some other sources I've read it can be done, one of them being FreeNAS. So it could very well be a bug in the iSCSI boot; in most cases when we have tried to boot from SAN with iSCSI, it has usually just ended in headaches or tears at some point.

Are you using an iSCSI HBA to boot from SAN?

The fact that it just completely fails when you have LACP going, but works when you don't, seems fishy. Another way you could test this: don't boot from SAN, and get the environment up and running. Then create datastores with the SAN set up the way you want, and test to see if any errors pop up. If the problem really is a boot-from-iSCSI issue, you shouldn't see any issues with the datastores on the SAN, but if you do see errors, it could possibly lead you to a KB article or a possible fix.

or

You could set up your SAN with two iSCSI networks and use port binding, and see if you can boot successfully that way. Either way should yield some results that help pinpoint the problem. For the datastore test, a couple of log and path checks are sketched below.
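
These are the kinds of checks I'd run from the ESXi shell while the test datastores are under load; only a sketch, and the software iSCSI adapter name (often vmhba33 or higher) will vary:

     # watch the VMkernel log for SCSI/iSCSI errors while copying data to the test datastore
     tail -f /var/log/vmkernel.log | grep -i scsi

     # confirm the expected number of paths to the LUN and their state (active/dead)
     esxcli storage core path list

     # list the current iSCSI sessions on the software adapter
     esxcli iscsi session list --adapter=vmhba33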

Hope this has helped.

MaximH
Contributor

Hi!

Some in-line answers to the points you bring up:

JPM300 wrote:

Hey,

In that diagram, I can't remember: do they specifically state it's LACP, or is it just a grouped target? SANs like Dell EqualLogic use a group or a VIP for the iSCSI controller system, but that still doesn't use LACP. The group or VIP IP is just a management IP, and an MPIO driver then sorts out which connection the traffic enters/leaves on.

The KB doesn't really specify anything about the storage side, except what is shown in the illustration. What's interesting is that they say the following will/could happen if you don't do port binding on the hypervisor with the described configuration:

        • Unable to see storage presented to the ESXi/ESX host.
        • Paths to the storage report as Dead.
        • Loss of path redundancy messages in vCenter Server.

So taking the points above into account, one could say those symptoms are more or less what is seen during the iSCSI boot sequence, when the storage suddenly cuts out, or something else malfunctions so that the data is perceived as "corrupt".

So the question is: if port binding has to be done on the hypervisor for a single-target-IP storage with multiple uplinks to work correctly, is that handled in the software iSCSI initiator/boot code?

Well, if the storage vendor supports those iSCSI ports being in an LACP channel, then it should be okay on that end. However, most SANs I have seen don't support the LACP protocol on the iSCSI ports, or the best practice is to break it up and let MPIO sort it out.

The storage vendor, in this case QNAP, supports and even recommends using LACP to increase throughput and provide failover capabilities with VMware.

Are you using an iSCSI HBA to boot from SAN?

No, it's done through the Intel network adapter.

The fact that it just completely fails when you have LACP going, but works when you don't, seems fishy. Another way you could test this: don't boot from SAN, and get the environment up and running. Then create datastores with the SAN set up the way you want, and test to see if any errors pop up. If the problem really is a boot-from-iSCSI issue, you shouldn't see any issues with the datastores on the SAN, but if you do see errors, it could possibly lead you to a KB article or a possible fix.

Now you've touched on what's most interesting/annoying about this, and why I suspect this is a bug in the iSCSI initiator in the hypervisor.

The LACP configuration, which is set up on the QNAP storage, never changes. The only change I make is unplugging one of the network cables on the ESXi host, so that it only has a single uplink. But those connections have no relation to each other; they are configured as single trunked ports on the switch, as you can see in the configuration described in the first post.

The IP storage (QNAP) is, and always was, running LACP. So the only change is on the ESXi host, where it boots fine when only one uplink is connected, but fails when all uplinks (2) are connected. No LACP or etherchannel/port-channel is configured on the ESXi host.

Once the hypervisor is up and running, meaning I've unplugged all but one network cable to get it to boot, reconnecting the cables results in a perfectly working configuration. The datastores are mounted (from the iSCSI storage running LACP), the traffic flows as expected, VMs run without any problems on the mounted datastores, etc, etc. So the only issue is getting it to boot, without having to unplug all but one network cable on the host.

Too bad no one from VMware has seen or commented on this so far. It would be interesting to hear what they have to say, or think, about this case.

JPM300
Commander

Which Intel NIC are you using? If it just has iSCSI offloading but isn't a full iSCSI HBA, booting from SAN will probably not work, or will be iffy at best. This could be part of the problem: without an Intel iSCSI HBA, the host can't establish the connections to the SAN until ESXi is booted, but it can't boot because it isn't connected to the LUN in the first place.

The fact that it works when you just unplug one cable is interesting; even with the storage still in an LACP channel, with one connection down there is only one path to take, so there is no way for MPIO to get confused. I have a feeling the MPIO driver is having issues with connections, ARP, MAC addresses, or something else when both connections are up in the LACP channel.

When it comes to the iSCSI binding, you only have to bind the NICs to the iSCSI vmk's when both port groups are on the same network. If you had iSCSI1 on 192.168.5.x and iSCSI2 on 192.168.6.x, it wouldn't require the binding.

http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=203886...
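
A quick way to double-check what is actually bound and reachable from the host; the adapter name, vmk numbers and target IP below are just placeholders for your own values:

     # list the VMkernel ports bound to the software iSCSI adapter
     esxcli iscsi networkportal list --adapter=vmhba33

     # test that the target answers from each specific VMkernel interface
     vmkping -I vmk1 10.0.5.7
     vmkping -I vmk2 10.0.5.7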

Could you provide us with the Intel NIC model, and we can look up any driver issues or KB articles?
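
You can pull that straight from the host if that's easier; the vmnic number here is just an example:

     # model, driver and link state for every uplink
     esxcli network nic list

     # driver and firmware details for a specific uplink
     esxcli network nic get --nic-name=vmnic4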
