VMware Cloud Community
za_mkh
Contributor

Strange vSphere Issue - Changing Queue Depth Settings

Hi Guys,

Cross posting here and on the expert-exchange forums.

Hoping you can help me with an issue I can't seem to solve. The problem shows up on both vSphere ESX and vSphere ESXi, on systems that are fully patched (except for the Cisco Nexus patches).

I was changing the queue depth setting for our QLogic FC cards (qla2xxx) on our ESX/i hosts.

The command to do so is as follows:

esxcfg-module -s -ql2xmaxqdepth=64 qla2xxx

esxcfg-boot -b (this is only for ESX hosts. For ESXi, you just restart the server)

Running this command verifies that the option was set:

esxcfg-module -g qla2xxx

I also see the value added to the /etc/esx.conf file.

You then restart the server and all should be OK.
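The verification step can be scripted so that a wrong or missing option is caught before the reboot. This is only a minimal sketch: it assumes `esxcfg-module -g` prints one line of the form `qla2xxx enabled = 1 options = '...'`, and the function names `module_options` and `has_option` are made up for illustration. Check the output format on your own build first.

```shell
# Extract the options string from one line of `esxcfg-module -g` output.
# ASSUMPTION: the line looks like
#   qla2xxx enabled = 1 options = 'ql2xmaxqdepth=64'
# Adjust the pattern if your build prints something different.
module_options() {
    # Keep only what sits between the single quotes after "options =".
    printf '%s\n' "$1" | sed -n "s/.*options = '\(.*\)'.*/\1/p"
}

# Succeed only if the reported options contain the expected name=value pair.
has_option() {
    status_line=$1
    expected=$2
    for opt in $(module_options "$status_line"); do
        if [ "$opt" = "$expected" ]; then
            return 0
        fi
    done
    return 1
}

# Example with a hand-written status line (not captured from a real host).
sample="qla2xxx enabled = 1 options = 'ql2xmaxqdepth=64'"
if has_option "$sample" "ql2xmaxqdepth=64"; then
    echo "queue depth option is set"
fi
```

On a host, the status line would come from `esxcfg-module -g qla2xxx` instead of the hand-written sample.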

Now the first server I did this on worked like a charm, and I confirmed it was working. So I then foolishly did this to three other servers simultaneously. This is when it went bad. The ESX or ESXi server starts up, but it does not load the qla2xxx driver. The hosts are all nearly identical, and the QLogic cards are the same model with identical firmware (2.02). If I load the driver manually using vmkload_mod qla2xxx, it loads with no issues, and all my SAN volumes reappear.

On one server, doing this sorted it out so that the qla2xxx driver loads at boot time:

esxcfg-module -d qla2xxx

esxcfg-module -e qla2xxx

However, it doesn't work on any of the other ESX/i hosts.

In the meantime, I have taken to manually adding

vmkload_mod qla2xxx

to the /etc/rc.local file on the affected hosts, which means it is the last thing to run when the ESX/i host starts. So it solves the problem for the time being, but not really!

But I'm not happy with my solution ... and I can't seem to work out why the driver doesn't just load automatically. Can anybody give me a few concrete avenues to look at?
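If the rc.local workaround is rolled out to several hosts, it helps to add the line only when it is not already there, so repeated runs do not stack up duplicate load commands. The helper below is a rough sketch written for a generic POSIX shell; `ensure_line` is a hypothetical name, and it has not been tried on an ESX service console or ESXi busybox shell.

```shell
# Append a line to a file only if that exact line is not already present.
ensure_line() {
    line=$1
    file=$2
    # -x: match the whole line, -F: fixed string, -q: no output
    if ! grep -qxF -- "$line" "$file" 2>/dev/null; then
        printf '%s\n' "$line" >> "$file"
    fi
}

# Demo against a scratch file rather than the real rc.local.
demo_file=$(mktemp)
ensure_line "vmkload_mod qla2xxx" "$demo_file"
ensure_line "vmkload_mod qla2xxx" "$demo_file"   # second call changes nothing
grep -c "vmkload_mod qla2xxx" "$demo_file"       # counts matching lines
rm -f "$demo_file"
```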

7 Replies
sholmes
Contributor

Did you ever get a resolution to this? Check your spelling of "ql2xmaxqdepth=64". I left the second q out and it bombed out.

RParker
Immortal

Command to do so is as follows: esxcfg-module -s -ql2xmaxqdepth=64 qla2xxx

FYI, the more ESX hosts you have, the more the qdepth should go DOWN, not up.

For instance, 4 hosts should have a qdepth of 8 (or so); 8 hosts should be more like 4...
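The two figures above multiply out to the same total (4 x 8 and 8 x 4 are both 32), which suggests a fixed budget shared between hosts. The sketch below just does that division. The budget of 32 is inferred from those two examples and is an assumption, not a vendor-documented number, and `per_host_depth` is a made-up helper name.

```shell
# Split a shared queue budget evenly across hosts.
# ASSUMPTION: a budget of 32 is only inferred from the two examples in the
# post (4 hosts x 8 and 8 hosts x 4); it is not a vendor figure.
per_host_depth() {
    budget=$1
    hosts=$2
    if [ "$hosts" -lt 1 ]; then
        echo "need at least one host" >&2
        return 1
    fi
    depth=$((budget / hosts))
    # Never suggest a depth below 1.
    if [ "$depth" -lt 1 ]; then
        depth=1
    fi
    echo "$depth"
}

per_host_depth 32 4   # four hosts sharing the budget
per_host_depth 32 8   # eight hosts sharing the budget
```

Real sizing also depends on the array and the number of LUNs, so treat the result as a starting point only.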

za_mkh
Contributor

No answer, unfortunately.

I definitely did not miss the second q in the string - it caught me out when I first tested that setting on ESX 3.5 months ago... so I made sure I typed it in properly.

za_mkh
Contributor

That always confused me ... I did read a blog article on how to correctly choose the queue depth and knew it had to be sized appropriately. However, in our case, we use SANMelody, and its cache always seems to be more than capable of handling this setting.

What we have done is set the queue depth on the FC cards and the QLogic driver to 64, but we are keeping Disk.SchedNumReqOutstanding at 32. The reason is as follows: my reading on this matter says that SchedNumReqOutstanding limits the number of SCSI requests that a single virtual machine can send (if there is more than one virtual machine running on the ESX host). So my logic was that if a VM does go stir-crazy with disk I/O requests, it will be throttled at 32, while still leaving another 32 free for the other virtual machines.

Well so far we have had no issues, so can't really tell!

But point noted for future reference.

chukarma
Enthusiast

Did you ever figure out what happened to the hosts that wouldn't load the qla2xxx driver without the workaround? I have a four-host cluster running vSphere, and after changing the queue depth, one of the hosts refused to load the driver. The workaround worked, but it doesn't make sense that this has to be done to load the driver.

chukarma
Enthusiast

I finally resolved my issue, although I am not sure what happened or whether this was the fix.

I found this error on the host that wasn't able to load the qla2xxx driver:

/var/log/boot-logs/sysboot.log

Loading module qla2xxx.o failed. Exec of command '/usr/sbin/vmkload_mod qla2xxx -ql2xmaxqdepth=16' succeeded, but returned with non-zero status: 1
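To check several hosts for the same symptom, the boot log can be scanned for this kind of message. The function below is a sketch that assumes every failure line follows the wording of the excerpt above ("Loading module ... failed. Exec of command '...'"); other builds may phrase it differently, and `failed_module_loads` is a hypothetical name.

```shell
# Print "module: command" for every failed module load found in a boot log.
# ASSUMPTION: failure lines follow the wording of the sysboot.log excerpt
# quoted in this thread; anything worded differently is silently skipped.
failed_module_loads() {
    sed -n "s/.*Loading module \(.*\) failed.*command '\(.*\)'.*/\1: \2/p" "$1"
}

# On a host this would be run as:
#   failed_module_loads /var/log/boot-logs/sysboot.log
```

Each output line pairs the module with the exact command the boot process ran, which makes a malformed option string easy to spot.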

So I suspected there was something wrong with the command being run at boot and went ahead and re-issued the commands to change the queue depth:

esxcfg-module -s ql2xmaxqdepth=16 qla2xxx

esxcfg-boot -b

reboot.

Now after the reboot, I no longer see the error in the sysboot.log file and the driver loaded like it was supposed to. I guess re-issuing the commands fixed it for me, so hopefully it will for others who are having this problem.

One strange thing I wanted to understand is that, according to the KB article here:
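Comparing the command in the log with the re-issued one, the visible difference is the dash in front of ql2xmaxqdepth. Whether that dash is what made the load fail is not confirmed in this thread, but a quick sanity check on the option string before saving it costs nothing. The sketch assumes options are space-separated name=value pairs, as in the re-issued command; `valid_module_opts` is a made-up helper.

```shell
# Check that every token in a module option string has the form name=value
# and does not start with a dash.
# ASSUMPTION: options are space-separated name=value pairs, as in the
# re-issued command; this is not taken from the driver documentation.
valid_module_opts() {
    for opt in $1; do
        case $opt in
            -*)    return 1 ;;   # leading dash, e.g. -ql2xmaxqdepth=16
            ?*=?*) ;;            # name=value, accepted
            *)     return 1 ;;   # anything else
        esac
    done
    return 0
}

if valid_module_opts "ql2xmaxqdepth=16"; then
    echo "accepted: ql2xmaxqdepth=16"
fi
if ! valid_module_opts "-ql2xmaxqdepth=16"; then
    echo "rejected: -ql2xmaxqdepth=16"
fi
```

An empty string passes the check, since it contains no tokens to reject.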

http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1267

the command listed for vSphere is:

For QLogic:

vicfg-module -s ql2xmaxqdepth=64 qla2300_707

But I can't find this command on my vSphere host and had to use the esxcfg-module command instead.

iwsmith
Contributor

Don't know if you ever solved this, but I cleared the setting with:

esxcfg-module -s '' qla2xxx

i.e. the options are empty.

My machine loads the FC modules properly.

Ian
