VMware recently published a paper titled Scalable Storage Performance that delivered a wealth of information on storage with respect to the ESX Server architecture. This paper contains details about the storage queues that are a mystery to many of VMware's customers and partners. I wanted to start a wiki article on some aspects of this paper that may be interesting to storage enthusiasts and performance freaks.
Let's use the following figure as a starting point for this discussion.
For the purposes of this article, I'm going to call the two different queue types the "kernel queue" and the "device driver queue". The device driver queue is specified in the device itself and has historically been configured through Linux-like module commands in the console operating system. More on that in "Changing Queue Depth" below. The kernel queue should be thought of as infinitely long, for all practical purposes. Any time the device driver queue gets full, commands to the storage will queue up in the kernel.
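To make the relationship between the two queues concrete, here is a minimal Python sketch of a single LUN's queues. It is a toy model, not ESX code: the class name `LunQueues`, its methods, and the first-in-first-out promotion out of the kernel queue are all assumptions made for illustration (the real kernel may reorder what it holds).

```python
from collections import deque


class LunQueues:
    """Toy model of one LUN: a bounded device driver queue plus an
    unbounded kernel queue that absorbs the overflow (hypothetical names)."""

    def __init__(self, device_depth=32):
        self.device_depth = device_depth
        self.in_flight = []          # commands handed to the device driver queue
        self.kernel_queue = deque()  # treated as infinitely long

    def submit(self, cmd):
        # A command goes straight to the device while a slot is free;
        # otherwise it waits in the kernel queue.
        if len(self.in_flight) < self.device_depth:
            self.in_flight.append(cmd)
        else:
            self.kernel_queue.append(cmd)

    def complete(self, cmd):
        # A completion frees one device slot, so the oldest waiting
        # command (FIFO is an assumption) is promoted into it.
        self.in_flight.remove(cmd)
        if self.kernel_queue:
            self.in_flight.append(self.kernel_queue.popleft())


lun = LunQueues(device_depth=2)
for cmd in ["read-0", "read-1", "read-2", "read-3"]:
    lun.submit(cmd)

lun.complete("read-0")
print(lun.in_flight, list(lun.kernel_queue))
```

With a depth of 2, the third and fourth commands wait in the kernel queue until a completion frees a device slot.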
Note that each LUN gets its own queue. This means that when you change the queue depth in the device driver, you're changing the queue depths for many queues. The underlying device (HBA) is going to have a hard limit on the number of active commands it will allow at one time. This should be considered when setting queue depth. If your HBA can support only 2,000 active commands but it is addressing 40 LUNs, a specified queue depth of 64 won't allow that many commands to all LUNs at once, because 64 * 40 = 2,560, which is more than the 2,000-command maximum. In practice this is rarely a concern, though, since it is rare for so many LUNs to be addressed simultaneously through a single HBA with so many outstanding commands issued to each of them.
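The arithmetic in that example can be checked with a few lines of Python. This is only a sketch: the function name `check_hba_budget` is hypothetical, and the "even share" figure assumes the HBA's command limit is split equally across LUNs, which real hardware does not necessarily do.

```python
def check_hba_budget(hba_max_commands, lun_count, configured_depth):
    """Compare the worst-case number of outstanding commands
    (configured depth on every LUN) with the HBA's hard limit."""
    requested = configured_depth * lun_count
    return {
        "requested": requested,
        "oversubscribed": requested > hba_max_commands,
        # Assumption: an even split of the HBA limit across all LUNs.
        "even_share_per_lun": min(configured_depth, hba_max_commands // lun_count),
    }


# Numbers from the example above: 2,000-command HBA, 40 LUNs, depth 64.
result = check_hba_budget(hba_max_commands=2000, lun_count=40, configured_depth=64)
print(result)
```

For the numbers above, the worst case asks for 2,560 outstanding commands, so the configured depth of 64 cannot be reached on all 40 LUNs at the same time.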
The device driver queue is used for a low-level interaction with the storage device. It controls how many active, or "in flight", commands there can be at any one time. This is effectively the concurrency of the storage stack. Set the device queue to 1 and each storage command becomes sequential: each one must complete before the next starts.
But if the device queue is left at its default of 32, for example, up to 32 commands can be processed concurrently by the storage system. The kernel ships all 32 off to the storage device and issues new commands as completions arrive.
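The effect of queue depth on concurrency can be sketched with a small simulation. It rests on a strong simplifying assumption: every command takes the same fixed service time and the array can work on all in-flight commands in parallel. Real storage will not scale this cleanly, and the function name `total_time_ms` is made up for this example.

```python
import heapq


def total_time_ms(num_commands, queue_depth, service_time_ms):
    """Time to finish num_commands when at most queue_depth are in flight.
    Idealized: fixed service time, perfectly parallel storage."""
    completions = []  # min-heap of completion times for in-flight commands
    now = 0.0
    for _ in range(num_commands):
        if len(completions) == queue_depth:
            # Device queue is full: wait for the earliest completion.
            now = heapq.heappop(completions)
        heapq.heappush(completions, now + service_time_ms)
    return max(completions) if completions else 0.0


sequential = total_time_ms(num_commands=64, queue_depth=1, service_time_ms=5.0)
concurrent = total_time_ms(num_commands=64, queue_depth=32, service_time_ms=5.0)
print(sequential, concurrent)
```

Under these assumptions, 64 commands at depth 1 finish one after another, while at depth 32 they finish in two batches. The gap in a real system depends on how much parallelism the array can actually deliver.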
The kernel queue can be thought of as kind of an overflow queue for the device driver queues. But it's not just an overflow queue. ESX Server contains all kinds of cool optimizations to get the most out of your storage. And these features apply to commands in the kernel queue only. Here are some examples of features provided to commands queued at the kernel queues:
There are others, as well.
So, increasing queue depths in the device driver can greatly improve the performance of the storage at the device level. Decreasing the device driver queue depth pushes more commands into the kernel queues. This decreases the device efficiency, but introduces opportunities for optimizations across multiple VMs and devices. So, what's the right ratio of these two depths? We think that the sweet spot lies with a device driver queue depth of 32. That's why we've set 32 as the default device driver queue length.
But your configuration and workloads may benefit from a change to this default queue depth. I'll refer you to the aforementioned storage paper for information on when you might want to change the driver queue depth. I'll just point out a couple of broad observations here:
Now that we've covered how storage queuing works, you may be wondering how you can monkey around with these queue sizes for optimal performance. I can tell you, as someone who has been involved with many, many performance analysis projects, that changing queue size is rarely a fix for an acute storage performance problem. You should first go through the analysis techniques in Storage Performance Analysis and Monitoring. That may or may not lead to changing queue depths.
But, in the event that you do end up changing queue depths...
We have a helpful knowledge base article that describes the process of changing the device driver queue on ESX. For ESXi you will need to modify the queue using the vMA. First find the HBA module name (as the first command does below) then change the depth of the queue against the matching module name using the second command:
It would be great to have an update on ESXi
"Unfortunately, as of today (7/24/08) this document only describes how to change queues through the console operating system. No information is provided for ESXi. I've contacted the KB owner and will have that document updated ASAP."
There is still no update for ESXi on the KB article.
Just added it.
More information on my blog and on Twitter:
thanks Scott!
Thank you for all this helpful information. Is there a way to also configure this parameter for the iSCSI software adapter? I am using ESX Server 3i, 3.5.0, 110271. Any help and pointers would be greatly appreciated.
Setting Maximum Queue Depth for Software iSCSI:
Does this mean that for iSCSI we can tweak it to the maximum setting / value for the Queue depth?