Hi Guys,
Cross posting here and on the expert-exchange forums.
Hoping you can help me with an issue I can't seem to solve. The problem shows up on both vSphere ESX and vSphere ESXi. The systems are fully patched (except for the Cisco Nexus patches).
I was changing the queue depth setting for our QLogic FC cards (qla2xxx) on our ESX/i hosts.
Command to do so is as follows:
esxcfg-module -s -ql2xmaxqdepth=64 qla2xxx
esxcfg-boot -b (this is only for ESX hosts. For ESXi, you just restart the server)
Running this command verifies that the option was set:
esxcfg-module -g qla2xxx
I also see the value added to the /etc/esx.conf file.
You then restart the server and all should be OK.
Now the first server I did this on worked like a charm, and I confirmed it was working. So I then foolishly did this to three other servers simultaneously. This is when it went bad. The ESX or ESXi server starts up, but it does not load the qla2xxx driver. The hosts are all nearly identical, and the QLogic cards are the same model with identical firmware (2.02). If I load the driver manually using vmkload_mod qla2xxx, it loads with no issues, and all my SAN volumes reappear.
On one server, doing the following sorted it out so that the qla2xxx driver loads at boot time:
esxcfg-module -d qla2xxx
esxcfg-module -e qla2xxx
However it doesn't work on any of the other ESX/i hosts.
In the meantime, I have taken to manually adding vmkload_mod qla2xxx to the /etc/rc.local file on the affected hosts, which means it is the last thing to run when the ESX/i host starts. So it solves the problem for the time being, but not really!
I'm not happy with my solution, though, and I can't work out why the driver doesn't just load automatically. Can anybody give me a few concrete avenues to look at?
Did you ever get a resolution to this? Check your spelling on this "ql2xmaxqdepth=64". I left the second q out and it bombed off.
Command to do so is as follows: esxcfg-module -s -ql2xmaxqdepth=64 qla2xxx
FYI, the more ESX hosts you have, the more the qdepth should go DOWN, not up.
For instance 4 hosts should be qdepth of 8 (or so). 8 hosts should be more like 4...
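The two figures above (4 hosts at about 8, 8 hosts at about 4) work out to roughly 32 outstanding commands shared across all the hosts. Here is a rough sketch of that rule of thumb. The budget of 32 is only inferred from those two examples, and suggest_qdepth is an illustrative name, not a VMware or QLogic tool or recommendation:

```shell
#!/bin/sh
# Rule-of-thumb sketch: split a shared queue budget evenly across hosts.
# The default budget of 32 is an ASSUMPTION inferred from
# "4 hosts -> 8, 8 hosts -> 4"; adjust it for your own storage.
suggest_qdepth() {
    hosts=$1
    budget=${2:-32}
    if [ "$hosts" -lt 1 ]; then
        echo "hosts must be at least 1" >&2
        return 1
    fi
    depth=$((budget / hosts))
    # Never suggest a queue depth below 1.
    if [ "$depth" -lt 1 ]; then
        depth=1
    fi
    echo "$depth"
}

suggest_qdepth 4   # prints 8
suggest_qdepth 8   # prints 4
```

A different budget can be passed as the second argument if your array can take more outstanding commands.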
No answer, unfortunately.
I definitely did not miss the second q in the string. It caught me out when I first tested that setting on ESX 3.5, months ago, so I made sure I typed it in properly.
That always confused me ... I did read a blog article on how to correctly choose the queuedepth and knew it had to be sized appropriately. However, in our case, we use SANMelody, so the cache that it has always seems to be more than capable of handling this setting.
What we have done is set the queue depth on the FC cards and the QLogic driver to 64, while keeping Disk.SchedNumReqOutstanding at 32. The reason is as follows: my reading on this matter says that SchedNumReqOutstanding limits the number of SCSI requests that a single virtual machine can send (if there is more than one virtual machine running on the ESX host). So my logic was that if a VM does go stir-crazy with disk I/O requests, it will be throttled at 32, while still leaving another 32 free for the other virtual machines.
Well so far we have had no issues, so can't really tell!
But point noted for future reference.
Did you ever figure out what happened to the hosts that would not load the qla2xxx driver without the workaround? I have a four-host cluster running vSphere, and after changing the queue depth, one of the hosts refused to load the driver. The workaround worked, but it doesn't make sense that this has to be done to load the driver.
I finally resolved my issue, although I am not sure what happened or whether this was the fix:
I found this error in my host that wasn't able to load the qla2xxx driver:
/var/log/boot-logs/sysboot.log
Loading module qla2xxx.o failed. Exec of command '/usr/sbin/vmkload_mod qla2xxx -ql2xmaxqdepth=16' succeeded, but returned with non-zero status: 1
So, I suspected there was something wrong with the command the kernel was loading and went ahead and re-issued the commands to change the queuedepth:
esxcfg-module -s ql2xmaxqdepth=16 qla2xxx
esxcfg-boot -b
reboot.
Now, after the reboot, I no longer see the error in the sysboot.log file and the driver loaded like it was supposed to. I guess re-issuing the commands fixed it for me, so hopefully it will for others who are having this problem.
One strange thing I wanted to understand concerns the KB article here:
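Note that the failing boot command in the log passes the option as -ql2xmaxqdepth=16, with a leading dash, while the re-issued command sets ql2xmaxqdepth=16 without one. That may be what tripped up the module load, though nothing in this thread confirms it. Below is a small sketch for sanity-checking an option string before handing it to esxcfg-module -s. check_module_opts is a hypothetical helper, not part of ESX, and it only inspects the string:

```shell
#!/bin/sh
# Hypothetical helper: checks that every option looks like name=value
# and does not start with a dash. It does NOT call esxcfg-module.
check_module_opts() {
    for opt in $1; do
        case "$opt" in
            -*)  echo "bad: '$opt' starts with a dash"; return 1 ;;
            *=*) ;;  # looks like name=value, accept it
            *)   echo "bad: '$opt' is not name=value"; return 1 ;;
        esac
    done
    echo "ok"
}

check_module_opts "ql2xmaxqdepth=16"
check_module_opts "-ql2xmaxqdepth=16" || true
```

An empty string also passes, since setting no options at all is valid.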
http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1267
It lists the command to use for vSphere as:
For QLogic:
vicfg-module -s ql2xmaxqdepth=64 qla2300_707
But I can't find this command on my vSphere host and had to use the esxcfg-module command instead.
Don't know if you ever solved this but I cleared the setting with:-
esxcfg-module -s '' qla2xxx
i.e. options are NULL
My machine loads the FC modules properly
Ian