VMware Cloud Community
Slingsh0t
Enthusiast

iSCSI performance with Round Robin

Hi guys,

I've been pulling my hair out trying to figure out why my ESXi 5 hosts are unable to use 100% of their available bandwidth to the storage array on the SAN.

For a VERY simple diagram on the setup I'm dealing with, see attachment "setup.png".  This shows how one host is connected to the array.  Imagine the config is the same for all other hosts.

As you can see, the host has 3 physical uplinks connected to the SAN, and the array also has 3 ACTIVE uplinks to the SAN.  (The other controller is passive, so for all intents and purposes it isn't a factor in this post.)

There is only 1 broadcast domain, with jumbo frames enabled end to end.  All links are 1Gbps Cat5e, with a twinax LAG between the switches in the middle (a requirement per Dell's documentation for SANs).

The Actual Issue

When the three vmk ports are configured with IPs as per the first attachment and bound to the software iSCSI HBA, I'm seeing 33% utilization on each of these three links.  Shouldn't Round Robin give me full link utilization of all three uplinks?  In the "charts" attachment, you can see that when I remove one of the vmk ports from the iSCSI HBA bindings, link utilization increases on the remaining two SAN uplinks.  Finally, I remove another vmk port binding and get 100% usage of the remaining SAN uplink.  Why is this happening?  Is this the way Round Robin is expected to work?
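
For anyone who wants to compare, the claiming PSP and the visible paths can be checked with something like this (the naa device ID below is just a placeholder for one of my volumes):

    # Confirm VMW_PSP_RR is the active path selection policy on each volume
    esxcli storage nmp device list

    # Show every path (adapter, channel, target, LUN) for a single volume
    esxcli storage core path list --device=naa.xxxxxxxxxxxxxxxx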

I have played around with the IOPS policy (setting the iops=1 parameter) and found that after a few days storage performance dropped away, and when two VMs are competing for storage on the same host they both seem to take a massive hit in storage performance.
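
For reference, this is the sort of command I used to set that, applied per volume (the naa device ID is a placeholder):

    # Switch the Round Robin path-change trigger from the default 1000 I/Os
    # per path down to 1 I/O per path (placeholder device ID)
    esxcli storage nmp psp roundrobin deviceconfig set --device=naa.xxxxxxxxxxxxxxxx --type=iops --iops=1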

7 Replies
Slingsh0t
Enthusiast

Bit of a shameless bump here, 70+ views and not one response?

JohnADCO
Expert

I'm going to call it normal.  Round Robin becomes much more effective as you add VMs and LUNs.  But really, you're not worried about sustained throughput on your MPIO unless it's a unique application like multimedia (in which case you wouldn't want iSCSI anyway).  All you're worried about is whether your business applications have enough IOPS to perform at their potential.
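
If you do keep adding LUNs, it can also save time to make Round Robin the default PSP for the SATP that claims them; something along these lines (VMW_SATP_EQL is an assumption for a Dell EqualLogic array, so check what esxcli storage nmp device list actually reports first):

    # Make Round Robin the default PSP for new devices claimed by this SATP
    # (VMW_SATP_EQL is assumed here; substitute the SATP your array really uses)
    esxcli storage nmp satp set --satp=VMW_SATP_EQL --default-psp=VMW_PSP_RR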

I didn't respond because I didn't catch a question in there the first time I read through it.

RParker
Immortal

Yeah, what JohnADCO says is exactly right.  Bandwidth is just the amount of traffic being SENT to the SAN; IOPS is the key.  You don't need to send huge chunks of packets to manipulate the data, and VMware uses APIs and enhanced storage techniques to offload work to the SAN anyway.  Lots of bandwidth can just mean the server is doing the work rather than the SAN.

A better SAN and VMware together give you better efficiency, so watching your network traffic isn't a good measure of performance.  Look at the SAN's IOPS; that will tell you the performance you are actually getting.

Slingsh0t
Enthusiast

Perhaps I'm still not following what you're both saying... To confirm: I should only care about the number of IOPS I get between guest and storage and ignore the amount of bandwidth being used?

In the past I have tested storage performance with Dell's Multipathing Extension Module (MEM), which was able to saturate all storage uplinks on the host (and subsequently the array too) and scored an average of 10.5k IOPS.  MEM is currently in beta for ESXi 5, hence I'm looking at the stock VMware PSPs as an alternative.

The highest number of IOPS I've seen with Round Robin has been around the 3.5k-4k mark.  I'm beginning to think the IOPS figure I'm seeing with RR is normal but wanted to seek confirmation here from other members (and perhaps comparisons?).

Another question: if I configure an additional 2 or 3 uplink adapters on a host as storage VMkernel ports and bind these to the same software iSCSI HBA that already has 2 or 3 storage uplinks, can I expect to achieve a higher number of IOPS when doing a stress test?
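
For clarity, I mean adding bindings along these lines (the vmk and vmhba numbers are placeholders for whatever the host actually has):

    # Bind two additional VMkernel ports to the existing software iSCSI adapter
    # (vmk4/vmk5 and vmhba33 are placeholders; check with: esxcli iscsi adapter list)
    esxcli iscsi networkportal add --adapter=vmhba33 --nic=vmk4
    esxcli iscsi networkportal add --adapter=vmhba33 --nic=vmk5

    # Rescan so the new paths are discovered
    esxcli storage core adapter rescan --adapter=vmhba33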

JohnADCO
Expert

Maybe.  I can attest that even one 1Gbps link can handle a ton of application IOPS.  Most I/Os are really small, so they just don't tax the bandwidth at all.

rickardnobel
Champion

Slingsh0t wrote:

I'm seeing 33% utilization on each of these three links.  Shouldn't Round Robin give me full link utilization of all three uplinks?

How many datastores do you have? If only one datastore, then this is very much expected. With the default Round Robin settings the VMkernel sends 1000 I/Os down one path to the datastore, then 1000 down the next, and then 1000 down the third. Only one path is in use at any given moment, which gives the natural ~33% average utilization per link that you are seeing.
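
You can check that per-path limit with something like this; the I/O operation limit it reports is the number of I/Os sent down a path before the VMkernel moves to the next one (1000 by default, and the device ID below is a placeholder):

    # Show the current Round Robin settings for one device
    esxcli storage nmp psp roundrobin deviceconfig get --device=naa.xxxxxxxxxxxxxxxx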

My VMware blog: www.rickardnobel.se
mcowger
Immortal

This is the key.  NMP Round Robin is round robin, NOT load balancing.  You still only use one path at a time.

If you want better than that, you need an array/plugin combination that supports using all the paths simultaneously.

As far as I am aware, there is only one (EMC PowerPath/VE plus any EMC array).  Otherwise, you are as good as you can get.
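
If you do end up trying a third-party MPP later (PowerPath/VE, or Dell's MEM once it's out of beta for ESXi 5), you can confirm what is actually registered on the host with something like:

    # List the multipathing plugins on the host (NMP plus any third-party MPPs)
    esxcli storage core plugin list --plugin-class=MP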

(disclaimer, I work for EMC).

--Matt VCDX #52 blog.cowger.us