Configuring HBA Settings In ESXi 5

joerockt · ‎07-10-2012

I have 4 blades that I've installed ESXi 5 on. The blades have Emulex HBAs. I will be attaching to a Compellent SAN. Compellent has a best practices doc that has very specific settings to change for the HBAs on each host. Unfortunately, the Emulex BIOS does not contain the settings I need to change, so I need to do this at the OS level. Here are the changes I need to make according to Compellent:

Emulex Fiber Channel Card BIOS Settings

• The Node Time Out field “lpfc_devloss_tmo” (formerly “nodev_tmo”) should be set to 60 seconds.

o More info: http://kb.vmware.com/kb/1008487

• The “topology” field should be set to 2 for point to point only

• The “queuedepth” field should be set to 255

o This queue depth can be set to 255 because the ESXi VMkernel driver module and DSNRO can more conveniently control the queue depth

I found a doc from Emulex that explains how to do this. When I go to make one change, for example:

esxcfg-module -s "lpfc_devloss_tmo=60" lpfc820

It reflects in the config if I type:

esxcfg-module -g lpfc820

Which results with:

lpfc820 enabled = 1 options = 'lpfc_devloss_tmo=60'

However, when I go to make the next setting change, it seems to overwrite the previous one. I've tried rebooting and then resuming the next change, but it still overwrites.

Any ideas?

akshunj · ‎07-10-2012

Hi Joe,

Have you tried passing multiple parameters at the same time? Similar to:

esxcfg-module -s "lpfc_devloss_tmo=60, topology=2, que depth=255" lpfc820

joerockt · ‎07-11-2012

Ok I tried that. Rebooted, and it shows the settings on that line. But not sure if they are truly taking effect? Any thoughts on how I can test?

joerockt · ‎07-11-2012

One other thing to add is that prior to knowing about the Compellent best practices, we had our SAN controllers reboot on us (luckily one at a time) because the logs rapidly filled with remote port resets from all 4 hosts. Compellent told me about this doc and said the specific HBA settings should clear up those resets. Upon reboot of this host, I saw a few resets in the log, but not as many or as frequent so far. In fact, no others since the links came back up 20 mins ago. So I will continue to monitor.

akshunj · ‎07-11-2012

I'm not really sure how you could go about testing those parameters. My best guess is that if the symptoms are clearing up as described by the vendor then you're all set.

joerockt · ‎07-16-2012

So it doesn't look like putting all the parameters on the same line is working. Still getting port resets.

TonyNguyen · ‎08-16-2012

Hi,

I am going through pretty much exactly the same scenario you are going through Joe.

To confirm, all the variables need to be passed through at once. However, there are a few corrections:

Original:

esxcfg-module -s "lpfc_devloss_tmo=60, topology=2, que depth=255" lpfc820

Corrected:

esxcli system module parameters set -m lpfc820 -p "lpfc_devloss_tmo=60 lpfc_hba_queue_depth=255"

The settings are adjusted to the Compellent best practices. Also, the commas and the space between "que" and "depth" in the original will cause the input to fail; the parameters must be separated by spaces only.
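The formatting rule above can be checked before the string is handed to esxcli. This is only a rough sketch: check_opts is a made-up helper name, not part of ESXi, and it checks the shape of the string only (space-separated name=value pairs, no commas). It does not know whether a parameter name is valid for the lpfc820 driver. The last step prints the command instead of running it.

```shell
#!/bin/sh
# check_opts is a hypothetical helper (not an ESXi tool).
# It accepts a module option string only if every space-separated
# token looks like name=value and contains no comma.
check_opts() {
    for token in $1; do
        case "$token" in
            *,*)  echo "bad token (contains a comma): $token" >&2; return 1 ;;
            ?*=?*) ;;  # name=value, looks fine
            *)    echo "bad token (not name=value): $token" >&2; return 1 ;;
        esac
    done
    return 0
}

# Values taken from the corrected command in this thread.
OPTS="lpfc_devloss_tmo=60 lpfc_hba_queue_depth=255"

if check_opts "$OPTS"; then
    # Dry run: print the command so it can be reviewed before use.
    echo "esxcli system module parameters set -m lpfc820 -p \"$OPTS\""
fi
```

Running the same check against the original string ("lpfc_devloss_tmo=60, topology=2, que depth=255") stops at the first token because of the trailing comma.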

Once the changes are made, you can scroll through and verify that the right variables have the right values by running this command:

esxcli system module parameters list -m lpfc820 | more

Hope this helps!

joerockt · ‎08-16-2012

Tony, thanks for posting this. I actually discovered the correct syntax a few weeks ago, so my apologies for not posting sooner.

To add to this, I also needed a way to verify that these parameters were taking effect. If you browse to /var/log on one of the hosts and open up vmkernel.log, you should see, right after the point where it initializes the driver, three lines showing those parameters being loaded.
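The log check described above could also be done from the host's shell instead of opening the file by hand. This is a minimal sketch with some assumptions: show_lpfc_params is a made-up helper name, and it assumes the driver's log lines contain the parameter names verbatim. The exact wording of those lines is not shown in this thread, so the search strings may need adjusting.

```shell
#!/bin/sh
# show_lpfc_params is a hypothetical helper (not an ESXi tool).
# Usage: show_lpfc_params LOGFILE NAME [NAME...]
# Prints every line of LOGFILE that contains one of the given
# parameter names as a literal string.
show_lpfc_params() {
    log="$1"
    shift
    for name in "$@"; do
        grep -F -- "$name" "$log"
    done
}

# Example call on a host (parameter names from the corrected command):
# show_lpfc_params /var/log/vmkernel.log lpfc_devloss_tmo lpfc_hba_queue_depth
```

If nothing is printed for a parameter, either it was not loaded or the log words it differently than assumed here.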

HTH.

BTW Tony, do you have the same environment of a Compellent SAN and an HP c7000? Are you seeing port resets like I am in your system log as well? I still get them, but not as frequent as before.

TonyNguyen · ‎08-17-2012

Hi Joe, I am also on an HP c7000 chassis. Can you tell me which logs you are seeing the port resets in, and where to find them? Is there any possible issue with the fibre switch? I remember our Brocade switches needing to be updated to a certain level.

joerockt · ‎08-20-2012

In Storage Manager, select View and then System Log. Filter a date range (maybe the last month or so), check Level and choose Debug, and check Display System Level Messages. Check to see if you have any of the following:

Reset Remote Port [wwn] history stats caused by Link Stat index 2

and/or

Reset Remote Port [wwn] history stats caused by Link Stat index 4

See attached as well.

We are on Cisco 9124s, though Compellent seems to think it's a configuration issue on the HP side. Of course, HP says everything is configured correctly.
