VMware Cloud Community
dmarshallx
Contributor

Boot issues with ESX 3.5.1 (and ESX 3.5) on IBM LS21 using QLA4022 and EqualLogic SAN

We installed ESX 3.5.1 on a new IBM LS-21 blade on an IBM BladeCenter H. The LS-21 is running

BIOS 1.05

Qlogic add-on card with BIOS 1.09a and Firmware 2.00.00.62

32GB of RAM

Two 3.0GHz AMD dual-core processors

The LS-21 blade boots from our EqualLogic SAN. Whether we boot from the install CD or from the installed OS, the system stalls for 16-22 minutes when it reaches the “loading qla4022…” message on the console, and only then continues. We saw the same problem with ESX 3.5 and decided to try 3.5.1 after reviewing the release notes. However, this does not seem to have improved the situation.

All of our other LS-21 systems are running ESX 3.0.2 from the same EqualLogic SAN and they do not exhibit this behavior.

We’ve monitored the activity on the EqualLogic SAN console. During the boot, the ESX 3.5.1 system repeats a pattern of logins and resets.

The pattern:

- Server logs on to the SAN

- Server stays connected for 2 minutes, no bytes are reported read or written

- SAN Session is reset

The pattern repeats 5 or 6 times, each cycle starting within seconds of the previous session reset, before the system stays connected and begins loading the remainder of the OS.
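As a rough sanity check on those numbers (a sketch, not a diagnosis): if each cycle holds an idle session for about 2 minutes and the cycle runs 5 or 6 times, the login/reset loop alone accounts for roughly 10-12 minutes of the 16-22 minute stall. The few seconds between cycles are ignored, and the function name is only for illustration.

```python
def stall_minutes(cycles, idle_minutes_per_cycle=2):
    """Minutes spent in idle iSCSI sessions before the boot continues."""
    return cycles * idle_minutes_per_cycle


low, high = stall_minutes(5), stall_minutes(6)
print(f"login/reset cycles account for about {low}-{high} minutes")
```

The remaining minutes are not explained by anything we have observed on the SAN console so far.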

We took a sniffer network trace of this activity. The BIOS load completes around packet 150, the qla4022 loading message appears at about packet 3150, and the OS continues loading around packet 5300. By packet 61000 the OS is loaded and up.
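One way to line the trace up with the 2-minute idle sessions is to look for long silences between consecutive packets. The sketch below assumes the per-packet timestamps have already been exported from the capture tool as a list of seconds; the function name, the 60-second threshold, and the sample data are all made up for illustration and are not taken from the real trace.

```python
def find_idle_gaps(timestamps, min_gap_seconds=60.0):
    """Return (packet_number, gap_seconds) for each packet that arrives
    at least min_gap_seconds after the previous one (1-based numbering)."""
    gaps = []
    for i in range(1, len(timestamps)):
        gap = timestamps[i] - timestamps[i - 1]
        if gap >= min_gap_seconds:
            gaps.append((i + 1, gap))
    return gaps


# Made-up timestamps in seconds, NOT from the real capture.
example = [0.0, 0.2, 0.5, 120.6, 120.9, 241.0]
for packet_number, gap in find_idle_gaps(example):
    print(f"packet {packet_number}: {gap:.1f} s of silence before it")
```

If the real capture shows 5 or 6 such gaps between packets 3150 and 5300, that would match what the SAN console reports.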

The Qlogic card is configured with:

Jumbo Frames

Manual IP (no gateway)

Header and Data digest

Target IP address and strings set

Is this a known issue? If so, what is the resolution?

74 Replies
adehart
Contributor

I'm no ESX expert, but I've experienced this exact issue and have a pretty good handle on what ours is, and it may be the same issue you are having. One thing I can't tell from your post is which QLogic add-on card you have. Is it a QLA4050 or QLE4060 variant? Also, from my experience, to run 3.5.1 you need FW version 3.0.1.33, which fixes some issues with 3.5.1. There is in fact a 3.0.1.45 FW in beta that addresses further issues with the 4060 series of adapters.

Anyway, here is my issue in a nutshell:

IBM x3650 running ESX 3.5.1 booting from iSCSI SAN

QLE4062C w/BIOS 1.13 and FW 3.0.1.33. (IBM version of the QLOGIC adapter is actually being used)

  • One HBA, one port active - Hangs at the QLA4022 driver loading process for 11-15 mins

  • One HBA, two ports active - Does not hang

  • Two HBA, one port active on each - hangs at the QLA4022 driver load process

VMware engineering gave me a test QLA4022 driver to try. It fixes the hang with one port active on one HBA but does not fix the two-HBA issue. The two-HBA issue probably won't be addressed anytime soon, as ESX only supports 2 initiators. I was under the impression that we could have 2 initiators active across 2 dual-port HBAs and then use the other two ports at some point in the future, but VMware informed me that they do not support that configuration and that support for more than 2 initiators is quite some time away.

At the moment, I'm probably going to use two QLA4050C cards instead of the QLA4062C cards; the 4050C cards work fine since they are single-port cards. My only other option is to go with one physical HBA, but that removes our hardware redundancy, which isn't what we'd like.

Let me know if that matches up with your experience.

Good luck.

christianZ
Champion

I have noticed you have an older FW there - check this: http://communities.vmware.com/message/897880#897880

In addition, I would first try without jumbo frames and data digest. I remember someone here having big problems when booting with jumbo frames.

dmarshallx
Contributor

We have the Qlogic Chip in the IBM LS21.... It is a QMC4052.

dmarshallx
Contributor

I forgot to mention that IBM does not support FW 3.0.1.33

BenConrad
Expert

dmarshallx
Contributor

AlexNG_
Enthusiast

Hi Dmarshallx,

I've had similar issues with iSCSI. I've already read and tested several things. My workaround was a mix of all the posts mentioned in this thread:

- First of all, downgrade to firmware 3.0.1.27 (none other worked!!!!)

- Check the Advanced Settings in the HBA BIOS: keep all options disabled except the first one, and leave MTU at the default (1500)

- Configure IP (you'll need to check the TCP/IP options once ESX is up).

- other options as default
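To keep track of which hosts already match this workaround, the two concrete values from the list (firmware 3.0.1.27 and MTU 1500) can be written down as a small baseline and compared against what you read off each HBA by hand. This is only a bookkeeping sketch: the key names are made up, nothing here talks to the card, and the sample values (including the 9000 MTU) are just an example.

```python
# Baseline taken from the list above; key names are hypothetical.
WORKAROUND_BASELINE = {
    "firmware": "3.0.1.27",
    "mtu": 1500,
}


def deviations(current, baseline=WORKAROUND_BASELINE):
    """Return {key: (current_value, baseline_value)} for settings that differ."""
    return {
        key: (current.get(key), wanted)
        for key, wanted in baseline.items()
        if current.get(key) != wanted
    }


# Example values noted manually from one HBA's BIOS screens.
hba = {"firmware": "3.0.1.33", "mtu": 9000}
for key, (have, want) in deviations(hba).items():
    print(f"{key}: have {have}, workaround uses {want}")
```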

Regards,

AlexNG

If you find this information useful, please award points for "correct" / "helpful".
i2ambler
Contributor

Hey.. I'm having this issue now, trying to boot from SAN using qla4022 on 2 QLogic 4060C cards. The cards came with 3.0.1.33. I called QLogic and they said the 4060 does not support the older firmware?? I'm sort of screwed here with a couple hundred grand worth of equipment if I can't get this to work ASAP. I have also been completely unable to find that older firmware.. :( :( I MUST boot from SAN.

adehart
Contributor

The 3.0.1.33 firmware is a requirement for the 4060C adapters. You say you are having the same delay at boot, correct? If so, do both adapters have LUNs attached and cables plugged in? If not, you'll get the delay unless you are using a patched version of the QLA4022 driver that technically isn't released yet. When I tried the patched driver, it fixed our problem. If necessary, I can get you my case number to reference with VMware tech support, so they can try the patch with you if the issue turns out to be the same.

I guess the question is, does it work with only one adapter in the system?

I'm told there is also a 3.0.1.45 firmware being released soon. I don't think it addresses this issue, but it is specific to the 4060C and 4062C.

i2ambler
Contributor

Well, I am just trying to install ESX. Only the first card is plugged into the switch. Not only does it delay, it just never loads the driver at all.. I tried removing the second adapter and attempting with just one card: same thing.. it sits at the qla4022 driver and never loads it... This is very frustrating. I have jumbo frames off on both the Cisco switch and the card, and basically all of the advanced card settings disabled.

adehart
Contributor

In our case, the QLA4022 driver load stalled for 11-15 minutes and then continued, and everything worked fine after that, so your issue sounds different. If you want, our case number for the issue was SR# 1107536311. I'd mention it to VMware support and have them check your issue against the notes in that case. I had to go through a long series of steps to isolate the issue, including shutting off almost every device in our x3650 and working with one card to verify it was the adapter. Have you discussed the case with IBM support too? I'm assuming you are using the IBM flavor of the QLogic adapter; IBM provided us QLA4050C cards to check that it was strictly a card issue.

One final thing: can you load another OS? It helped us a great deal to know that we could boot Windows but not VMware.

Unfortunately, I'm not well versed with VMware specifically yet so I can only offer insight as it relates to my situation.

TomHowarth
Leadership

Thread moved to the VI: ESX 3.5 forum

Tom Howarth

VMware Communities User Moderator

Tom Howarth VCP / VCAP / vExpert
Blog: http://www.planetvm.net
Contributing author on VMware vSphere and Virtual Infrastructure Security: Securing ESX and the Virtual Environment
Contributing author on VCP VMware Certified Professional on VSphere 4 Study Guide: Exam VCP-410
AlexNG_
Enthusiast

In our case, the only way to install ESX was to disable everything in the HBA's BIOS (including IPv4), or to remove the HBA from the host (q4062c). After installation, the only way to get ESX to boot fine (even with a LUN mapped) was downgrading to firmware 3.0.1.27. I know it's not supported, and if you call support they'll tell you to upgrade to the newest version, but in our case that was not an option: ESX was not operational with it!!!

AlexNG

i2ambler2002
Contributor

Where do I find this firmware? Like I said, I MUST boot from SAN, so removing IPv4 is not an option. I have 3 Dell 2950s sitting here with no hard drives that I need to install ESX onto... I have already created the 3 boot LUNs and am beginning to install on one of the systems. QLogic said that I cannot use this older firmware on the 4060, and I am getting conflicting info from the internet.. Do you happen to know where I can find the older firmware to test? Thanks

i2ambler
Contributor

Weird.. that was me. I have another account I use for home that I must have still been logged into. :)

AlexNG_
Enthusiast

Hi,

Our storage guys gave me that link:

AlexNG

i2ambler
Contributor

Unfortunately that points to the 3.0.1.33 firmware.. *grumble* I just can't believe that QLogic would actually tell me that the 3.0.1.27 firmware is not supported with the 4060C when it clearly states that 3.0.1.27 was a previous version.. man, this is bothersome.. My settings are:

Jumbo frames off at switch and on cards

Both cards plugged into switch and pointing to LUN on EqualLogic iSCSI SAN

all advanced options disabled except delayed ack

IPv4 enabled for boot from SAN.. (tried without IPv4 enabled and got the same result.. tried with a single card, same result)

ESX sits at the qla4022 driver for 14 minutes.. then asks to 'check cd'. I skip it.. it goes back to the qla4022 driver again for 14 minutes or so.. then once that's done, it says it cannot find hard drives..

adehart
Contributor

My understanding was that 3.0.1.27 was supported on the 4060 series of adapters BUT 3.0.1.33 was required for ESX 3.5 because it fixed some issues with it. That in fact was the case for me as the QLA4022 driver failed to load unless 3.0.1.33 was on the card.

As a test, can you load Windows and boot it from the SAN on the 4060 cards with 3.0.1.33 loaded? That was a good test for me, as it pointed the finger at VMware rather than the hardware, although admittedly it could still come back to QLogic and a firmware adjustment.

There is a 3.0.1.45 firmware in beta specifically for the 4060 to address timeout issues although the notes seem to relate more to DHCP and IPv6 types of fixes. You might ask if that is available for you to try though.

i2ambler
Contributor

Well.. I don't have a USB floppy on hand; however, I did try installing Fedora Core 9, which has the qla4xxx drivers. It also sits at "initializing firmware" at install.... I guess I will try another machine and set of cards to see if it does the same thing.
