gamania
Contributor

iSCSI MPIO throughput limited to 1G

I followed VMware's "iSCSI SAN Configuration Guide" (filename vsp_40_iscsi_san_cfg.pdf) to set up vSphere 4 with software iSCSI MPIO. Round robin is also enabled on the storage to use both paths. On my EqualLogic PS5000X I can see that both paths are active and carrying traffic during an SQLIO test. However, throughput in the SQLIO test maxes out at 1G. Where could the problem be? I expected MPIO to give greater throughput, as it does in a native Windows setup, but that doesn't seem to be the case.

33 Replies
depping
Leadership

With round robin you aren't using both paths at exactly the same time. You need to use "fixed" and load balance the paths to the LUNs manually to get a higher total throughput. From what I've heard, EQL is working on a PSP which should lead to better load balancing.

Duncan

VMware Communities User Moderator | VCP | VCDX

If you find this information useful, please award points for "correct" or "helpful".

gamania
Contributor

Could you explain more about 'manually load balancing'? If we do MPIO in Windows, isn't RR the most popular way to gain better throughput?

depping
Leadership

The document you referred to holds the answer: http://www.vmware.com/pdf/vsphere4/r40/vsp_40_iscsi_san_cfg.pdf

Take a look at pages 63-65. The diagram on page 65 in particular clarifies my comment. You will need to select a different path for each LUN you create. This way all paths can be used at the same time.
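
The manual balancing described here amounts to giving each LUN a different preferred path under the Fixed policy. A minimal Python sketch of that bookkeeping follows. The LUN and path names are hypothetical placeholders, and the code only models the assignment; it does not talk to ESX or change any path policy.

```python
from collections import Counter


def assign_preferred_paths(luns, paths):
    """Hand out paths to LUNs in rotation so each path serves a similar number of LUNs."""
    if not paths:
        raise ValueError("at least one path is required")
    return {lun: paths[i % len(paths)] for i, lun in enumerate(luns)}


# Hypothetical names, for illustration only.
luns = ["lun0", "lun1", "lun2", "lun3"]
paths = ["path_a", "path_b"]

assignment = assign_preferred_paths(luns, paths)
per_path = Counter(assignment.values())

for lun, path in assignment.items():
    print(f"{lun} -> {path}")
```

With four LUNs and two paths, each path ends up as the preferred path for two LUNs, so both links can carry traffic at once as long as I/O is spread across the LUNs.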

Duncan

Blogging: http://www.yellow-bricks.com

gamania
Contributor

Basically, are you saying we cannot achieve more than 1G for one LUN from one host, even with MPIO set up using the s/w initiator? BTW, the document you referred to is the same one I mentioned in my first post; I followed the instructions in the doc to set up MPIO.

depping
Leadership

Well, I think you can add a second NIC to the vmkernel for the EQL. But I haven't tested it personally. You will need to try it.

Duncan

gamania
Contributor

We are waiting for quad-port NICs. I will give it a try and post the status here.

Rumple
Virtuoso

The SW initiator in ESX cannot use more than one NIC at a time to access a single target. Running 2 tests at the same time with a VM on each LUN (accessed through each NIC) will probably show the 2 Gbit connection. 30 VMs on the same LUN will always use 1 pNIC...

I am not sure if that's changed in vSphere or not.

In reality though, 1 Gbit should be plenty of performance for 20-30 VMs unless you have virtualized a lot of Tier 1 apps, in which case you probably want to move to hardware iSCSI NICs and/or move to vSphere.

gamania
Contributor

vSphere is different. Check out , pages 32-34. It does support multipathing on the s/w initiator now, and it does show as working on my EqualLogic monitor screen. But the problem is that the combined throughput of the 2 paths cannot exceed 1G in total (each path shows roughly 500M separately), which is so weird. I don't understand which part is limiting the traffic to 1G.

Rumple
Virtuoso

Oops...wasn't paying attention and thought I was in the ESX 3 area...my bad...

Could it be a queue depth problem on the array or ESX side? What about flow control?

gamania
Contributor

How do I configure flow control on ESX? I don't think it would make such a huge difference anyway. The EqualLogic side has no control over it at all. But it works just fine in a native Windows environment.

AndreTheGiant
Immortal

Flow control must be enabled on physical switches.

Andre

**if you found this or any other answer useful please consider allocating points for helpful or correct answers

Andre | http://about.me/amauro | http://vinfrastructure.it/ | @Andrea_Mauro

kunhuang
Enthusiast

Did you make sure the storage side is using at least 2 gigabit ports for iSCSI connections?

For EQL, it might be using one default port no matter how many NICs there are on the SW iSCSI side. That would result in:

I1 --1G-->
            --1G--> T1
I2 --1G-->

If you are sure the iSCSI connections are spread as:

I1 --1G--> --1G--> T1
I2 --1G--> --1G--> T1

it can potentially reach 2G, but that depends on the array/ESX implementation, load balancing algorithms, I/O load pattern, etc.
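
The two layouts above can be compared with a rough ceiling calculation: the aggregate cannot exceed the smaller of the total initiator-side capacity and the total target-side capacity. The Python sketch below is an idealized model only. It assumes sessions are spread evenly and ignores protocol overhead; the function name is made up for this illustration.

```python
def aggregate_ceiling_gbps(initiator_links, target_links):
    """Upper bound on throughput: the smaller of the two sides' total link capacity."""
    return min(sum(initiator_links), sum(target_links))


# Two 1G initiator NICs, but the array answers on a single 1G port.
single_port = aggregate_ceiling_gbps([1, 1], [1])

# Two 1G initiator NICs and two 1G array ports in use.
dual_port = aggregate_ceiling_gbps([1, 1], [1, 1])

print(f"single target port: {single_port}G, two target ports: {dual_port}G")
```

This only gives a best case. Whether the second layout actually reaches 2G still depends on the factors listed above.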

gamania
Contributor

Here are some of my screen shots

Fixed Path

The policy was changed from round robin to fixed path during the test. It's very clear that eth2 drops off while eth0 is maxed out at 1G.

Round Robin

Both eth0 and eth2 are being utilized at roughly 500 Mbps.

BenConrad
Expert

If you are looking for bandwidth testing, I'd ditch SQLIO and use IOMeter. IOMeter is a very good way to figure out how much data you can fit in each of your 1 Gb/s pipes.
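
Benchmark tools typically report throughput in megabytes per second, while the links are rated in gigabits per second, so it helps to convert before comparing. The short Python sketch below does the raw line-rate conversion using decimal units. It is an upper bound only: the usable payload rate is lower once Ethernet, TCP/IP, and iSCSI overhead are accounted for, and that overhead is not modeled here.

```python
def gbps_to_mbytes_per_s(gbps):
    """Convert a line rate in Gb/s to MB/s (decimal units, 8 bits per byte)."""
    return gbps * 1000 / 8


one_link = gbps_to_mbytes_per_s(1)
two_links = gbps_to_mbytes_per_s(2)

print(f"1 Gb/s link:  at most {one_link} MB/s")
print(f"2 Gb/s total: at most {two_links} MB/s")
```

A test result that plateaus a little below 125 MB/s is therefore a strong hint that a single 1 Gb/s link somewhere in the path is the limit.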

Ben

gamania
Contributor

Problem solved. It turns out the problem was caused by the test server being connected to an old test switch that only has a 1G uplink to the new switch. How stupid I am. Now I'm happy to see vSphere 4 supporting software iSCSI initiator MPIO natively and working just fine.
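
A shared uplink explains the earlier numbers: when both paths cross the same 1G inter-switch link and round robin splits the load evenly, each path gets about half of it. The Python sketch below models that under simplifying assumptions (an even split, equal path links, no protocol overhead); the function name is invented for this illustration.

```python
def per_path_rate_mbps(path_link_mbps, shared_link_mbps):
    """Rate per path when every path crosses one shared link that is split evenly."""
    fair_share = shared_link_mbps / len(path_link_mbps)
    return [min(link, fair_share) for link in path_link_mbps]


# Two 1000 Mbps paths funneled through a single 1000 Mbps uplink.
bottlenecked = per_path_rate_mbps([1000, 1000], 1000)

# Same paths once the shared link offers 2000 Mbps.
upgraded = per_path_rate_mbps([1000, 1000], 2000)

print("1G uplink:", bottlenecked, "total", sum(bottlenecked))
print("2G uplink:", upgraded, "total", sum(upgraded))
```

The first case gives 500 Mbps per path and 1000 Mbps in total, which matches the roughly 500M per path reported with round robin.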

paithal
VMware Employee

Could you please post results of your experiment with increased uplinks?

gamania
Contributor

I'm in the middle of getting an Intel Pro/1000 ET card working with vSphere 4. I will post the results once I'm done with it. It's painful being an early adopter. :)

Jim_Nickel
Contributor

While pages 63-65 of the document do describe a method for manual or static load balancing by setting certain paths for certain LUNs, you can certainly use Round Robin to improve performance across the board in an ACTIVE/ACTIVE configuration.

The document you reference indicates this on page 63 for Round Robin:

The host uses an automatic path selection algorithm rotating through all available paths. This implements load balancing across all the available physical paths.

Load balancing is the process of spreading server I/O requests across all available host paths. The goal is to optimize performance in terms of throughput (I/O per second, megabytes per second, or response times)

I see no reason why you wouldn't just use the Round Robin method in an ACTIVE/ACTIVE configuration. It should provide you with almost double the throughput while still retaining the failover capability.
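
The "rotating through all available paths" behavior in the quoted text can be pictured with a toy selector. The Python sketch below is not VMware's implementation: it switches paths on every I/O for simplicity, whereas a real path selection plugin may switch less often, and the path names are hypothetical.

```python
from collections import Counter
from itertools import cycle


def round_robin(paths, io_count):
    """Return the path chosen for each of io_count I/Os, rotating through paths in order."""
    chooser = cycle(paths)
    return [next(chooser) for _ in range(io_count)]


# Hypothetical path names, for illustration only.
choices = round_robin(["path_a", "path_b"], 6)
usage = Counter(choices)

print(choices)
print(dict(usage))
```

Over many I/Os each path carries an equal share, which is why the aggregate can approach the sum of the paths when nothing else in the network is the bottleneck.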

Maybe I am misunderstanding something?

Jim Nickel - VCP

Atmos4
Enthusiast

I have seen mixed results with Round Robin vs. Fixed Path on ESX 3.5. I think using round robin across two NICs to a single SP will result in lower throughput because of the overhead of switching between paths and of sending over two TCP connections (more protocol overhead).

So I really suggest testing with IOMeter and/or SQLIO and also checking what happens when multiple VMs are running I/O. Also make sure your SP CPU and RAID setup can deliver throughput in excess of 1 Gbps.

And most importantly, you will only need high throughput in rare cases (file server, backup). In most cases you will have a lot of random I/O that will never exceed 1 Gbps.

I don't have any reliable numbers yet on how efficiently vSphere's round robin works compared to the beta implementation in ESX 3.5.
