Hi,
I have a question about the KB article Controlling LUN queue depth throttling in VMware ESX / ESXi (1008113).
I need to change these parameters for an HP EVA4400 storage array because I found many messages like the following in the vmkernel log.
From vmkernel.log on host s1:
2014-12-02T04:03:24.178Z cpu3:1330540)NMP: nmp_ThrottleLogForDevice:2319: Cmd 0x2a (0x412401e1e0c0, 2951955) to dev "naa.6001438005df75c20000500004dd0000" on path "vmhba3:C0:T0:L17" Failed: H:0x0 D:0x28 P:0x0 Possible sense data: 0x0 0x0 0x0. Act:NONE
2014-12-02T04:03:24.184Z cpu3:2951958)NMP: nmp_ThrottleLogForDevice:2319: Cmd 0x2a (0x412401e82080, 1330588) to dev "naa.6001438005df75c20000500005820000" on path "vmhba2:C0:T0:L13" Failed: H:0x0 D:0x28 P:0x0 Possible sense data: 0x0 0x0 0x0. Act:NONE
2014-12-02T04:03:24.188Z cpu5:1257171)NMP: nmp_ThrottleLogForDevice:2319: Cmd 0x2a (0x4124401d0f80, 1272840) to dev "naa.6001438005df75c200005000018a0000" on path "vmhba3:C0:T0:L21" Failed: H:0x0 D:0x28 P:0x0 Possible sense data: 0x0 0x0 0x0. Act:NONE
2014-12-02T04:03:24.189Z cpu1:1330592)ScsiDeviceIO: 2318: Cmd(0x4124001dbac0) 0x2a, CmdSN 0x8000002d from world 1330537 to dev "naa.6001438005df75c20000500005820000" failed H:0x0 D:0x28 P:0x0 Possible sense data: 0x0 0x0 0x0.
Looking in HP I found this article
I would like to know whether these parameters can be changed while the virtual machines are running, or whether it is better to put the hosts in maintenance mode.
Thank You
Greetings
Hi,
You need to reboot your server. This is a device issue, but queue depth should be managed on the hosts.
Read these KBs to identify the exact issue using the SCSI sense codes:
http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=289902
VMware KB: Understanding SCSI device/target NMP errors/conditions in ESX/ESXi 4.x and ESXi 5.x
It seems your SCSI device queue is full. You should check your storage port capacity, and also check the queue depth settings on ESXi and in your HBA driver.
vmkernel: 1:08:42:28.062 cpu3:8374)NMP: nmp_CompleteCommandForPath:2190: Command 0x16 (0x41047faed080) to NMP device "naa.600508b40006c1700001200000080000" failed on physical path "vmhba39:C0:T1:L16" H:0x0 D:0x28 P:0x0 Possible sense data: 0x0 0x0 0x0.
This status is returned when the LUN prevents accepting SCSI commands from initiators due to lack of resources, namely the queue depth on the array.
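To make the H:/D:/P: fields in a log line like the one above easier to read, here is a minimal Python sketch that pulls out the device ID and translates the device status byte. The status names are the standard SCSI ones; the function name decode_line and the regular expressions are my own and not part of any VMware tool.

```python
import re

# Standard SCSI device status bytes; only the common ones are listed here.
SCSI_DEVICE_STATUS = {
    0x00: "GOOD",
    0x02: "CHECK CONDITION",
    0x08: "BUSY",
    0x18: "RESERVATION CONFLICT",
    0x28: "TASK SET FULL",
}

# Matches the "H:0x0 D:0x28 P:0x0" triplet, tolerating spaces after the colons.
STATUS_RE = re.compile(
    r"H:\s*(0x[0-9a-fA-F]+)\s+D:\s*(0x[0-9a-fA-F]+)\s+P:\s*(0x[0-9a-fA-F]+)"
)
# Matches a quoted NAA identifier such as "naa.6005...".
DEVICE_RE = re.compile(r'"(naa\.[0-9a-fA-F]+)"')


def decode_line(line):
    """Return (device, host_status, device_status_name, plugin_status),
    or None when the line carries no H:/D:/P: triplet."""
    status = STATUS_RE.search(line)
    if status is None:
        return None
    host, dev, plugin = (int(value, 16) for value in status.groups())
    device = DEVICE_RE.search(line)
    name = SCSI_DEVICE_STATUS.get(dev, "unknown (0x%02x)" % dev)
    return (device.group(1) if device else None, host, name, plugin)


if __name__ == "__main__":
    sample = (
        'vmkernel: 1:08:42:28.062 cpu3:8374)NMP: nmp_CompleteCommandForPath:2190: '
        'Command 0x16 (0x41047faed080) to NMP device '
        '"naa.600508b40006c1700001200000080000" failed on physical path '
        '"vmhba39:C0:T1:L16" H:0x0 D:0x28 P:0x0 Possible sense data: 0x0 0x0 0x0.'
    )
    print(decode_line(sample))
```

Run against the sample line, it should report TASK SET FULL for that device, which matches the explanation above.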
Adaptive queue depth code was introduced in ESX 3.5 U4 (native in ESX 4.x) that adjusts the LUN queue depth in the VMkernel. If configured, this code activates when device status TASK SET FULL (0x28) is returned for failed commands and essentially throttles back the I/O until the array stops returning this status.
For more information, see Controlling LUN queue depth throttling in VMware ESX/ESXi (1008113).
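To illustrate the throttling idea, here is a toy Python model. It is not VMkernel code. It assumes, as I understand KB 1008113, that the queue depth is halved once QFullSampleSize BUSY or TASK SET FULL statuses have been counted, and raised by one each time QFullThreshold good statuses have been counted. The class name, the default values, and the counter reset rules are my own assumptions.

```python
class AdaptiveQueueDepth:
    """Toy model of adaptive LUN queue depth throttling (illustrative only)."""

    BUSY = 0x08
    TASK_SET_FULL = 0x28

    def __init__(self, max_depth=32, sample_size=32, threshold=4):
        self.max_depth = max_depth
        self.depth = max_depth
        self.sample_size = sample_size  # models QFullSampleSize; 0 disables throttling
        self.threshold = threshold      # models QFullThreshold
        self.full_count = 0
        self.good_count = 0

    def on_completion(self, device_status):
        """Feed one command completion status and return the resulting depth."""
        if self.sample_size == 0:
            return self.depth  # throttling disabled, depth never changes
        if device_status in (self.BUSY, self.TASK_SET_FULL):
            self.full_count += 1
            self.good_count = 0  # assumption: a full status restarts the good streak
            if self.full_count >= self.sample_size:
                self.depth = max(1, self.depth // 2)  # throttle back by half
                self.full_count = 0
        else:
            self.good_count += 1
            if self.good_count >= self.threshold:
                if self.depth < self.max_depth:
                    self.depth += 1  # recover one slot at a time
                self.good_count = 0
        return self.depth


if __name__ == "__main__":
    queue = AdaptiveQueueDepth(max_depth=32, sample_size=32, threshold=4)
    for _ in range(32):
        queue.on_completion(0x28)
    print("depth after 32 TASK SET FULL statuses:", queue.depth)
    for _ in range(4):
        queue.on_completion(0x00)
    print("depth after 4 good statuses:", queue.depth)
```

With both parameters left at 0, as in a default configuration, the model never changes the depth, which is why the setting has to be enabled explicitly before any throttling happens.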
Thanks for the reply.
I'm testing with ESXi 5.5 build 1892794.
With the default settings, esxtop shows these values:
ADAPTR  PATH NPTH AQLEN CMDS/s READS/s WRITES/s MBREAD/s MBWRTN/s DAVG/cmd KAVG/cmd GAVG/cmd QAVG/cmd
vmhba0  -       1  1020   0.18    0.18     0.00     0.00     0.00     0.10     0.02     0.12     0.00
vmhba1  -       2  2000   5.92    0.00     5.92     0.00     0.01     0.59     0.00     0.59     0.00
vmhba2  -       1  2000   0.00    0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
vmhba32 -       0  1024   0.00    0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
vmhba33 -       0  1024   0.00    0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
vmhba34 -       0  1024   0.00    0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
vmhba35 -       0  1024   0.00    0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
DEVICE                               PATH/WORLD/PARTITION DQLEN WQLEN ACTV QUED %USD LOAD CMDS/s READS/s WRITES/s MBREAD/s MBWRTN/s DAVG/cmd KAVG/cmd GAVG/cmd QAVG/cmd
naa.600508b1001c7f7c03c175036d76e7ab -                     1020     -    0    0    0 0.00   0.00    0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
naa.600a0b80003698940000000000000000 -                       32     -    0    0    0 0.00   0.00    0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
naa.600a0b800036a10a0000000000000000 -                       32     -    0    0    0 0.00   0.00    0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
naa.600a0b800036a10a0000050050fd178c -                       32     -    0    0    0 0.00   0.00    0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
Output of esxcli storage core device list:
naa.600a0b80003698940000000000000000
   Display Name: SUN Fibre Channel Disk (naa.600a0b80003698940000000000000000)
   Has Settable Display Name: true
   Size: 20
   Device Type: Direct-Access
   Multipath Plugin: NMP
   Devfs Path: /vmfs/devices/disks/naa.600a0b80003698940000000000000000
   Vendor: SUN
   Model: Universal Xport
   Revision: 0670
   SCSI Level: 5
   Is Pseudo: true
   Status: degraded
   Is RDM Capable: true
   Is Local: false
   Is Removable: false
   Is SSD: false
   Is Offline: false
   Is Perennially Reserved: false
   Queue Full Sample Size: 0
   Queue Full Threshold: 0
   Thin Provisioning Status: unknown
   Attached Filters:
   VAAI Status: unknown
   Other UIDs: vml.02001f0000600a0b80003698940000000000000000556e69766572
   Is Local SAS Device: false
   Is Boot USB Device: false
   No of outstanding IOs with competing worlds: 32
naa.600a0b800036a10a0000000000000000
   Display Name: SUN Fibre Channel Disk (naa.600a0b800036a10a0000000000000000)
   Has Settable Display Name: true
   Size: 20
   Device Type: Direct-Access
   Multipath Plugin: NMP
   Devfs Path: /vmfs/devices/disks/naa.600a0b800036a10a0000000000000000
   Vendor: SUN
   Model: Universal Xport
   Revision: 0670
   SCSI Level: 5
   Is Pseudo: true
   Status: degraded
   Is RDM Capable: true
   Is Local: false
   Is Removable: false
   Is SSD: false
   Is Offline: false
   Is Perennially Reserved: false
   Queue Full Sample Size: 0
   Queue Full Threshold: 0
   Thin Provisioning Status: unknown
   Attached Filters:
   VAAI Status: unknown
   Other UIDs: vml.02001f0000600a0b800036a10a0000000000000000556e69766572
   Is Local SAS Device: false
   Is Boot USB Device: false
   No of outstanding IOs with competing worlds: 32
naa.600508b1001c7f7c03c175036d76e7ab
   Display Name: HP Serial Attached SCSI Disk (naa.600508b1001c7f7c03c175036d76e7ab)
   Has Settable Display Name: true
   Size: 69973
   Device Type: Direct-Access
   Multipath Plugin: NMP
   Devfs Path: /vmfs/devices/disks/naa.600508b1001c7f7c03c175036d76e7ab
   Vendor: HP
   Model: LOGICAL VOLUME
   Revision: 6.40
   SCSI Level: 5
   Is Pseudo: false
   Status: degraded
   Is RDM Capable: true
   Is Local: false
   Is Removable: false
   Is SSD: false
   Is Offline: false
   Is Perennially Reserved: false
   Queue Full Sample Size: 0
   Queue Full Threshold: 0
   Thin Provisioning Status: unknown
   Attached Filters:
   VAAI Status: unknown
   Other UIDs: vml.0200010000600508b1001c7f7c03c175036d76e7ab4c4f47494341
   Is Local SAS Device: false
   Is Boot USB Device: false
   No of outstanding IOs with competing worlds: 32
naa.600a0b800036a10a0000050050fd178c
   Display Name: SUN Fibre Channel Disk (naa.600a0b800036a10a0000050050fd178c)
   Has Settable Display Name: true
   Size: 1142360
   Device Type: Direct-Access
   Multipath Plugin: NMP
   Devfs Path: /vmfs/devices/disks/naa.600a0b800036a10a0000050050fd178c
   Vendor: SUN
   Model: LCSM100_F
   Revision: 0670
   SCSI Level: 5
   Is Pseudo: false
   Status: degraded
   Is RDM Capable: true
   Is Local: false
   Is Removable: false
   Is SSD: false
   Is Offline: false
   Is Perennially Reserved: false
   Queue Full Sample Size: 0
   Queue Full Threshold: 0
   Thin Provisioning Status: unknown
   Attached Filters:
   VAAI Status: unknown
   Other UIDs: vml.0200000000600a0b800036a10a0000050050fd178c4c43534d3130
   Is Local SAS Device: false
   Is Boot USB Device: false
   No of outstanding IOs with competing worlds: 32
KB 1008113 says that with ESXi 5.1 U1 and later the parameters can be set globally.
I want to set them globally.
I changed the following values under Advanced Settings > Disk:
QFullSampleSize 64
QFullThreshold 16
Then I restarted the host.
I ran esxtop again, but I still get the same values.
I ran esxcli storage core device list again, but I still get the same values.
If I set the parameters on the LUN (not globally), esxcli storage core device list shows that the parameters are set correctly.
So with version 5.5, must the parameters be set per LUN, and can they no longer be set globally?
How can I check whether the values have been configured correctly?
How do I restore the default values?
Thank You
greetings
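For checking the values across many LUNs at once, a small Python sketch like the following could parse saved output of esxcli storage core device list and print the two Queue Full fields per device. The function names and the header heuristic (a line with no "key: value" pair starts a new device) are assumptions of this sketch, chosen so that it also works on pasted output that has lost its indentation.

```python
def parse_device_list(text):
    """Parse `esxcli storage core device list` output into
    {device_id: {field_name: value}}."""
    devices = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        # Heuristic: field lines look like "Key: value" or "Key:".
        # Anything else (e.g. "naa.6005...") is treated as a device header.
        if ": " not in line and not line.endswith(":"):
            current = devices.setdefault(line, {})
        elif current is not None:
            key, _, value = line.partition(":")
            current[key.strip()] = value.strip()
    return devices


def queue_full_settings(devices):
    """Return {device_id: (sample_size, threshold)} as integers."""
    return {
        device: (
            int(fields.get("Queue Full Sample Size", 0)),
            int(fields.get("Queue Full Threshold", 0)),
        )
        for device, fields in devices.items()
    }


if __name__ == "__main__":
    sample = """naa.600a0b800036a10a0000050050fd178c
   Display Name: SUN Fibre Channel Disk (naa.600a0b800036a10a0000050050fd178c)
   Queue Full Sample Size: 0
   Queue Full Threshold: 0
   Attached Filters:
"""
    for device, (size, threshold) in queue_full_settings(parse_device_list(sample)).items():
        state = "disabled" if size == 0 else "enabled"
        print(device, "sample size", size, "threshold", threshold, state)
```

A device that still reports 0 for both fields has adaptive throttling disabled, so listing every device this way shows quickly whether a change reached all LUNs or only the ones set individually.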