Recommended configuration for Mapped RAW LUN and Storage Controllers

djdeceit · ‎01-09-2008

Hi All,

We have been experiencing massive SAN timeout issues on all our VMs, with thousands of the following messages on one particular host and a few on the others in the ESX 3.5 cluster:

Jan 9 16:53:11 cbrep085 vmkernel: 21:06:33:06.952 cpu2:1062)<6>qla24xx_abort_command(0): handle to abort=490
Jan 9 16:53:16 cbrep085 vmkernel: 21:06:33:12.025 cpu2:1062)<6>qla24xx_abort_command(0): handle to abort=573
Jan 9 16:56:26 cbrep085 vmkernel: 21:06:36:22.033 cpu2:1062)<6>qla24xx_abort_command(0): handle to abort=1639
Jan 9 16:56:59 cbrep085 vmkernel: 21:06:36:55.045 cpu2:1062)<6>qla24xx_abort_command(0): handle to abort=149
Jan 9 16:59:14 cbrep085 vmkernel: 21:06:39:10.097 cpu2:1062)<6>qla24xx_abort_command(0): handle to abort=348
Jan 9 16:59:19 cbrep085 vmkernel: 21:06:39:15.169 cpu2:1062)<6>qla24xx_abort_command(0): handle to abort=378
Jan 9 17:09:27 cbrep085 vmkernel: 21:06:49:22.377 cpu2:1062)<6>qla24xx_abort_command(0): handle to abort=823
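For anyone who wants to compare how these aborts are spread across hosts, here is a minimal Python sketch that tallies the messages per host and per minute. It assumes the log lines have exactly the shape quoted above; the function name `count_aborts` and the `SAMPLE` variable are made up for illustration, and the script is meant to be run against a copy of the vmkernel log on any machine with Python 3, not taken as an official VMware tool.

```python
import re
from collections import Counter

# Matches the vmkernel lines quoted in this post: syslog timestamp, host name,
# then the qla24xx_abort_command message with the handle being aborted.
# Assumption: other hosts log the message in the same format.
ABORT_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+(?P<host>\S+)\s+vmkernel:.*"
    r"qla24xx_abort_command\(\d+\): handle to abort=(?P<handle>\d+)"
)


def count_aborts(lines):
    """Return two Counters: aborts per host, and aborts per (host, minute)."""
    per_host = Counter()
    per_minute = Counter()
    for line in lines:
        match = ABORT_RE.search(line)
        if match is None:
            continue  # not an abort message, skip it
        host = match.group("host")
        minute = match.group("stamp")[:-3]  # drop ":SS" to bucket by minute
        per_host[host] += 1
        per_minute[(host, minute)] += 1
    return per_host, per_minute


# The seven lines from the post, used here as sample input.
SAMPLE = """\
Jan 9 16:53:11 cbrep085 vmkernel: 21:06:33:06.952 cpu2:1062)<6>qla24xx_abort_command(0): handle to abort=490
Jan 9 16:53:16 cbrep085 vmkernel: 21:06:33:12.025 cpu2:1062)<6>qla24xx_abort_command(0): handle to abort=573
Jan 9 16:56:26 cbrep085 vmkernel: 21:06:36:22.033 cpu2:1062)<6>qla24xx_abort_command(0): handle to abort=1639
Jan 9 16:56:59 cbrep085 vmkernel: 21:06:36:55.045 cpu2:1062)<6>qla24xx_abort_command(0): handle to abort=149
Jan 9 16:59:14 cbrep085 vmkernel: 21:06:39:10.097 cpu2:1062)<6>qla24xx_abort_command(0): handle to abort=348
Jan 9 16:59:19 cbrep085 vmkernel: 21:06:39:15.169 cpu2:1062)<6>qla24xx_abort_command(0): handle to abort=378
Jan 9 17:09:27 cbrep085 vmkernel: 21:06:49:22.377 cpu2:1062)<6>qla24xx_abort_command(0): handle to abort=823
""".splitlines()

if __name__ == "__main__":
    hosts, minutes = count_aborts(SAMPLE)
    for host, total in hosts.most_common():
        print(host, total)
    for (host, minute), total in sorted(minutes.items()):
        print(host, minute, total)
```

To use it on real data, pass an open log file instead of `SAMPLE`, since iterating a file yields its lines. Comparing the per-minute counts against the times the raw LUNs are busy may help confirm whether the aborts track that one VM's I/O.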

We narrowed it down to one particular VM configured with 4 x Virtual Mapped Raw LUNs.

All of the Mapped Raw LUNs have placeholders located on one of our shared LUNs, VMFS-L-Common01. There are 4 shared LUNs in total for this cluster.

What is the correct way to configure these on the SAN side? Should all of these Mapped Raw LUNs be configured to use the same Storage Controller as VMFS-L-Common01?

Or should we split these up over the various Common LUNs whilst keeping each Mapped Raw LUN on the same Storage Processor as the Common LUN its placeholder sits on?

Or is this not an issue when considering the SAN side?

What should we look for in the back end of these configurations? Obviously we have a serious problem here that seems to be specific to the Mapped Raw LUN setup.

This has happened in 2 separate clusters / data centres / SANs.

We have a support call open with VMware.

Thanks

djdeceit · ‎01-09-2008

Furthermore, here is a picture.

CBR3P111V has 4 raw disks located as placeholders on Common01.

It seems that if any of these are being hammered, any guest that is also on the same Common01 LUN with its own VMDK is affected. Additionally, any guest that is on the same ESX host slows to a crawl.

vdf -h responds extremely slowly when trying to enumerate the LUNs.

deceit · ‎03-04-2008

Bump. The problem still exists and there is no solution yet. Does anyone else have experience with this?

jeremypage · ‎03-11-2008

I'd be interested in your answer; I am getting the same qla24xx_abort_command message.

breewsky5765 · ‎08-17-2008

Hi. Have you figured out the solution? I'm seeing the same errors and it's causing hosts to momentarily lose connection to the VIC.

thanks

big_vern · ‎11-27-2008

same issue here.
