Hi all,
I have a setup that doesn't deliver the performance I expected. The datastore I created is dedicated to my test VM, and I measure performance with the following command:
fio --time_based --name=/dev/sdb --size=100G --runtime=30 --ioengine=libaio --randrepeat=0 --iodepth=32 --direct=1 --invalidate=1 --verify=0 --verify_fatal=0 --numjobs=4 --rw=randread --blocksize=4k --group_reporting
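(Side note in case anyone reproduces this: if fio happens to treat the --name value purely as a job label rather than as the target device, the raw device can be addressed explicitly with --filename, e.g.:
fio --time_based --name=randread-test --filename=/dev/sdb --size=100G --runtime=30 --ioengine=libaio --randrepeat=0 --iodepth=32 --direct=1 --invalidate=1 --verify=0 --verify_fatal=0 --numjobs=4 --rw=randread --blocksize=4k --group_reporting
where randread-test is just an arbitrary job name.)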
The following hardware is used:
- Dell H740P Adapter - 8x Intel S3710 SATA SSD in RAID 10 (4 stripes of 2 mirrored disks each)
The following software is used:
- VMware ESXi 6.5 U1
- Driver lsi-mr3 version 7.700.50.00-1OEM
This results in about 153k IOPS on average, but it should be capable of up to 500k IOPS. I increased the block size to 16k and the IOPS didn't change much, averaging about 128k. Checking my adapter's queue depth (AQLEN) gives:
ADAPTR PATH AQLEN
vmhba3 - 4040
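(These numbers come from esxtop: pressing d opens the disk adapter view shown above, and u opens the per-device view I use below for DQLEN.)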
This should be sufficient, so switching to disk view I get the following:
DEVICE PATH/WORLD/PARTITION DQLEN WQLEN ACTV QUED %USD LOAD CMDS/s READS/s WRITES/s MBREAD/s MBWRTN/s DAVG/cmd KAVG/cmd GAVG/cmd
naa.61866da0976aec00212af32636c0c7e9 - 64 - 0 0 0 0.00 4.09 3.12 0.58 0.01 0.00 0.16 0.01 0.1
Strangely enough, other devices are showing load and active command queues, but even during the fio benchmark not much changes here. It looks like this device is also stuck at a DQLEN of 64. Listing this device retrieves the following information:
naa.61866da0976aec00212af32636c0c7e9
Display Name: Local DELL Disk (naa.61866da0976aec00212af32636c0c7e9)
Has Settable Display Name: true
Size: 1523712
Device Type: Direct-Access
Multipath Plugin: NMP
Devfs Path: /vmfs/devices/disks/naa.61866da0976aec00212af32636c0c7e9
Vendor: DELL
Model: PERC H740P Adp
Revision: 5.00
SCSI Level: 5
Is Pseudo: false
Status: on
Is RDM Capable: true
Is Local: true
Is Removable: false
Is SSD: true
Is VVOL PE: false
Is Offline: false
Is Perennially Reserved: false
Queue Full Sample Size: 0
Queue Full Threshold: 0
Thin Provisioning Status: unknown
Attached Filters:
VAAI Status: unknown
Other UIDs: vml.020000000061866da0976aec00212af32636c0c7e9504552432048
Is Shared Clusterwide: false
Is Local SAS Device: true
Is SAS: true
Is USB: false
Is Boot USB Device: false
Is Boot Device: false
Device Max Queue Depth: 64
No of outstanding IOs with competing worlds: 48
Drive Type: logical
RAID Level: RAID1_0
Number of Physical Drives: 8
Protection Enabled: false
PI Activated: false
PI Type: 0
PI Protection Mask: NO PROTECTION
Supported Guard Types: NO GUARD SUPPORT
DIX Enabled: false
DIX Guard Type: NO GUARD SUPPORT
Emulated DIX/DIF Enabled: false
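(For anyone following along, that listing should be reproducible with:
esxcli storage core device list -d naa.61866da0976aec00212af32636c0c7e9
The fields I'm focusing on are "Device Max Queue Depth" and "No of outstanding IOs with competing worlds".)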
It seems this device's max queue depth is set to 64, which I can't increase:
# esxcli storage core device set -d naa.61866da0976aec00212af32636c0c7e9 -O 255
Unable to set device's sched-num-req-outstanding. Error was:Cannot set device queue depth parameter. sched-num-req-outstanding should be <= 64
Anybody got a clue how to increase the DQLEN?
Just noticed I pasted the wrong command above: I actually used the -m option to increase the maximum. It doesn't do anything, though, and the -O flag clearly complains because the device maximum is set to 64.
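To be precise, the sequence I tried was roughly this, with -m raising the device maximum (as I understand that option) and -O raising the outstanding IOs with competing worlds:
esxcli storage core device set -d naa.61866da0976aec00212af32636c0c7e9 -m 255
esxcli storage core device set -d naa.61866da0976aec00212af32636c0c7e9 -O 255
The first command doesn't change anything and the second fails with the error above.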
DQLEN depends on the driver you are using and how much it can handle. If your driver allows you to go higher than 64 you can increase it; otherwise you won't be able to.
From the device screenshot I see that your DQLEN is 64, but I don't see many active I/Os in the queue, which means they are being processed very fast.
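One thing you could check is whether the driver exposes any tunables at all by listing its module parameters (assuming the module is registered as lsi_mr3, which is how the lsi-mr3 VIB usually names it):
esxcli system module parameters list -m lsi_mr3
If a queue-depth parameter shows up there you could raise it with esxcli system module parameters set; if nothing relevant is listed, the 64 limit is hard-coded in the driver.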
Using the perccli utility I found that both my adapters are using the same lsi-mr3 driver. The first adapter is a Dell H710P, whose disk device shows a DQLEN of 320. So it isn't directly a driver limitation, but it also isn't a configurable item within the RAID controller. It's strange that the older adapter is capable of a higher DQLEN.
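(For completeness, I checked this with something along the lines of:
./perccli /call show
which, as far as I can tell, prints the basics for each controller, including the driver name and version; /c0 show and /c1 show give the same per controller.)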
When I re-run the benchmark with a 4k block size, I get the following data from esxtop:
DEVICE PATH/WORLD/PARTITION DQLEN WQLEN ACTV QUED %USD LOAD CMDS/s READS/s WRITES/s MBREAD/s MBWRTN/s DAVG/cmd KAVG/cmd GAVG/cmd QAVG/cmd
naa.61866da0976aec00212af32636c0c7e9 - 64 - 4 0 6 0.06 8751.21 8750.05 1.16 34.18 0.01 0.11 0.00 0.11 0.00
Hello,
Refer to this article: Large-scale workloads with intensive I/O patterns might require queue depths significantly greater t...
That document only relates to QLogic and Emulex cards. There are no configuration options for the LSI driver.
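For context, what that article describes boils down to raising the HBA driver's queue-depth module parameter, along the lines of (parameter names from memory, so treat them as an assumption):
esxcli system module parameters set -m qlnativefc -p "ql2xmaxqdepth=128"
esxcli system module parameters set -m lpfc -p "lpfc_lun_queue_depth=128"
followed by a reboot. The lsi-mr3 driver doesn't expose an equivalent parameter, so there is nothing comparable to set here.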
I know this thread is old, but did you ever find a solution to this?
Thanks