VMware Cloud Community
markokobal
Enthusiast

Multipathing for Software iSCSI - Load Balancing

Hi,

I would like to fully understand iSCSI load balancing with software iSCSI, and I would like to make the best of a low-budget software iSCSI system. See my configuration below:

[Attached configuration diagram: 9619_9619.jpg]

On my storage I have enabled multipathing to a single LUN.

On both ESX servers I have enabled multipathing with the Round Robin selection policy on the iSCSI paths (I have done everything exactly as it is shown in the "iSCSI SAN Configuration Guide" for ESX 4.0 U1), and it works. I would just like to check whether I got this right: the load balancing function of the software iSCSI configuration cannot provide more than 1 Gb of throughput for a single ESX server, no matter how many connections are used for multipathing. Is that correct? Because as far as I have seen, the traffic goes like this:

A = path 1 (first cable to the switch and on to Storage)

B = path 2 (second cable to the switch and on to Storage)

x = traffic from ESX 01

traffic:

A B
x /
x /
x /
/ x
/ x
/ x
x /
x /
x /
/ x
...

So only one path is active at a time. This would mean that more than 1 Gb of throughput to the storage can only be achieved when more than one ESX server is active at a time - for example two servers, both configured for Round Robin multipathing? And even then it is not guaranteed; it depends on the order of requests. In the ideal scenario the Round Robin algorithm would be lucky enough to select exactly opposite paths on ESX 01 and ESX 02:

A = path 1 (first cable to the switch and on to Storage)

B = path 2 (second cable to the switch and on to Storage)

x = traffic from ESX 01

y = traffic from ESX 02

traffic:

A B
x y
x y
x y
y x
y x
y x
x y
x y
x y
y x
...

So please, am I thinking about this correctly?

Are there any other methods to get more than 1 Gb of throughput on a low-budget Gb software iSCSI configuration? Has anyone successfully configured LACP (Link Aggregation Control Protocol) on vSphere 4, or is it not supported anyway?
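
For reference, a minimal sketch of how the Round Robin policy can be checked and tuned on a vSphere 4.0 host from the service console; the device ID naa.xxxxxxxx below is only a placeholder for the real LUN identifier:

# list devices, their path selection policy and their working paths
esxcli nmp device list

# set the Round Robin path selection policy on one device (placeholder ID)
esxcli nmp device setpolicy --device naa.xxxxxxxx --psp VMW_PSP_RR

# by default Round Robin switches paths only every 1000 I/Os;
# lowering the IOPS limit makes it alternate between paths much more often
esxcli nmp roundrobin setconfig --device naa.xxxxxxxx --type "iops" --iops 1

# verify the Round Robin settings
esxcli nmp roundrobin getconfig --device naa.xxxxxxxx

Even with the IOPS limit lowered, each individual path is still capped at the speed of its 1 Gb link, so this spreads I/O across paths rather than making a single stream exceed one link.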

Thanks for help!

--

Kind regards, Marko.

25 Replies
binoche
VMware Employee

Does your storage support port binding?

How many physical paths have you found for each LUN?
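
For anyone checking, a quick sketch of how the path count per LUN can be listed on ESX 4.0 from the service console (vmhba33 is an assumed name for the software iSCSI adapter):

# brief listing of every device and its paths
esxcfg-mpath -b

# detailed listing of all paths (state, adapter, target)
esxcfg-mpath -l

# list the VMkernel NICs bound to the software iSCSI adapter (port binding)
esxcli swiscsi nic list -d vmhba33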

binoche, VMware VCP, Cisco CCNA

mittim12
Immortal

Interesting blog post on utilizing multiple VMkernel portgroups and round robin: http://vmetc.com/2009/08/12/vswitch-with-multiple-vkernel-portgroups-for-vsphere-iscsi-round-robin-m...
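
Roughly the kind of setup that blog post describes, as a sketch only, with example names (vSwitch1, iSCSI1/iSCSI2, vmnic1/vmnic2) and example IPs, assuming two physical NICs:

# create one VMkernel portgroup per physical NIC on the iSCSI vSwitch
esxcfg-vswitch -A iSCSI1 vSwitch1
esxcfg-vswitch -A iSCSI2 vSwitch1

# give each portgroup its own VMkernel port and IP
esxcfg-vmknic -a -i 10.0.0.11 -n 255.255.255.0 iSCSI1
esxcfg-vmknic -a -i 10.0.0.12 -n 255.255.255.0 iSCSI2

# in the vSphere Client, override NIC teaming per portgroup so that
# iSCSI1 uses only vmnic1 as active and iSCSI2 uses only vmnic2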






If you found this or any other post helpful please consider the use of the Helpful/Correct buttons to award points

vanak
Contributor

Hi

I'm actually doing some benchmarking, and I ran into the same "problem" you describe. Round Robin OK, multipath OK, etc., but I can never get more than 1 Gb of throughput, in spite of a benchmark with 4 LUNs, 4 targets, 4 VMkernels, and 4 IPs on 4 different subnets.

Feedback from a VMware guru on this point would be greatly appreciated. :)

vnk.

binoche
VMware Employee

How many physical NICs do you have?

Can you make sure each of the 4 LUNs is using a different NIC and a different storage controller iSCSI port?
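
One way to check which path each LUN is actually using on ESX 4.0 (a sketch; the device ID is a placeholder):

# show the working path(s) currently used by each device
esxcli nmp device list

# list every path for one device, with its runtime name (vmhbaX:C:T:L)
# and state, so you can see which adapter and target port it goes through
esxcli nmp path list --device naa.xxxxxxxx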

binoche, VMware VCP, Cisco CCNA

vanak
Contributor

In this case, I'm using 4 physical NICs connected to a single vSwitch, with 4 VMkernel ports (failover policy "MAC"). I'm using a NetApp with 4 different NICs. I created 3 LUNs for a bandwidth test and ran a dd on these 3 LUNs at the same time.

When monitoring the ESX, all VMkernel ports and all vmknics are used. On the NetApp, all four interfaces are used too. But I never get more than 80 MB/s.

It seems the traffic is spread well across all available NICs, but it also seems we can't get more than 1 Gb of throughput. Is this an ESX limitation?
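
For what it's worth, a sketch of the kind of parallel dd test described above, run from the ESX service console against files on three test datastores (the datastore and file names are just examples):

# read a large test file from each datastore in parallel and time it
time dd if=/vmfs/volumes/bench1/testfile of=/dev/null bs=1M count=2048 &
time dd if=/vmfs/volumes/bench2/testfile of=/dev/null bs=1M count=2048 &
time dd if=/vmfs/volumes/bench3/testfile of=/dev/null bs=1M count=2048 &
wait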

binoche
VMware Employee

Hi,

vSphere has a "port binding" feature to use the bandwidth of all NICs; please refer to http://www.vmware.com/pdf/vsphere4/r40/vsp_40_iscsi_san_cfg.pdf.
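
A minimal sketch of the port binding step from that guide, assuming two VMkernel ports vmk1/vmk2 and a software iSCSI adapter named vmhba33:

# bind each iSCSI VMkernel port to the software iSCSI adapter
esxcli swiscsi nic add -n vmk1 -d vmhba33
esxcli swiscsi nic add -n vmk2 -d vmhba33

# confirm both vmknics are bound
esxcli swiscsi nic list -d vmhba33

# rescan so the extra paths show up
esxcfg-rescan vmhba33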

binoche, VMware VCP, Cisco CCNA

vanak
Contributor

Well, I've also done the port binding. :(

binoche
VMware Employee

Did you see any iSCSI failure messages in the VMkernel log?

binoche, VMware VCP, Cisco CCNA

Svedja
Enthusiast

What kind of connection setup is there between your switch and the storage?

If you have, for example, an EtherChannel aggregate/"trunk", it will load balance on IP (the default, I think) and you will not get more than 1 Gb/s, because ESX uses one IP and the storage uses one IP, so the EtherChannel will only send the traffic down one of the paths.

But if you have two single-IP connections between the switch and the storage, then the problem is probably not there.
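
A toy illustration of why a single IP pair stays on one link with IP-hash load balancing; this is not VMware's or the switch's exact hash, just the principle that a fixed source/destination pair always hashes to the same uplink:

# one ESX IP ending in .11 talking to one storage IP ending in .50
# over 2 aggregated links: the hash input never changes, so the
# chosen link never changes either
echo $(( (11 ^ 50) % 2 ))   # always prints the same link index (1 here)
echo $(( (11 ^ 50) % 2 ))   # same IP pair, same result, every time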

vanak
Contributor

I got no significant errors in the VMkernel log.

As for the network between the switch and the storage, I also tried a direct-attached connection. At first I had them connected through a switch, but I didn't use any kind of EtherChannel.

I connected them directly to be sure there was no problem with the switch...

"But if you have two single-IP connections between the switch and the storage, then the problem is probably not there."

I do not really understand; the 4 NICs of the storage, with 4 different IPs on different subnets, are actually directly connected to the ESX.

rogard
Expert

Which iSCSI software are you using on your storage device?

binoche
VMware Employee

Would jumbo frames be helpful here? Can you enable them and retest?
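
For reference, a sketch of enabling jumbo frames on an ESX 4.0 host from the service console (vSwitch1, portgroup iSCSI1, and the IPs are example names; the VMkernel NIC has to be removed and recreated with the larger MTU):

# raise the MTU on the vSwitch
esxcfg-vswitch -m 9000 vSwitch1

# recreate the VMkernel NIC with a 9000-byte MTU
esxcfg-vmknic -d iSCSI1
esxcfg-vmknic -a -i 10.0.0.11 -n 255.255.255.0 -m 9000 iSCSI1

# test with a large payload to the storage IP (example address)
vmkping -s 8000 10.0.0.50

Jumbo frames also have to be enabled end to end on the physical switch and on the storage interfaces, otherwise performance can actually get worse.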

binoche, VMware VCP, Cisco CCNA

vanak
Contributor

Hmm... my storage device is a NetApp, so it's the iSCSI from Data ONTAP.

vanak
Contributor

Tested with jumbo frames: same thing.

markokobal
Enthusiast

Hi,

The funny thing is that nobody seems to have an answer to whether there is even a theoretical possibility of getting more than 1 Gb of throughput with a multipath Gb iSCSI configuration on a single ESX server.

I guess not, so the catch of ESX iSCSI multipath load balancing is in using multiple ESX servers concurrently against a single iSCSI storage server; that way you can utilize more than 1 Gb of the storage server's throughput, but a single ESX server is always at 1 Gb max.

Is that right?

--

Kind regards, Marko.

binoche
VMware Employee

How about using RAID 0 on the NetApp?

With 1 Gb iSCSI, it should be close to 125 MB/s per physical path?
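
The arithmetic behind that number, as a quick check:

# 1 Gb/s link: 1,000,000,000 bits / 8 = 125,000,000 bytes/s, i.e. ~125 MB/s;
# in practice TCP/IP and iSCSI overhead leave roughly 100-120 MB/s per path
echo $(( 1000000000 / 8 / 1000000 ))   # prints 125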

binoche, VMware VCP, Cisco CCNA

rogard
Expert

Two things:

What switch do you have? (LACP is not supported, but port channel and HP's "trunk" are.)

Try a custom multipath policy:

esxcfg-mpath --lun vmhba32:0:8 --policy custom --custom-hba-policy any --custom-max-blocks 50 --custom-max-commands 50 --custom-target-policy any
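
Those --custom-* options appear to come from the older ESX 3.x style esxcfg-mpath syntax; on vSphere 4 the rough equivalent for making Round Robin alternate paths more aggressively is the NMP IOPS setting (the device ID below is a placeholder):

# vSphere 4: check and lower the Round Robin I/O operation limit per path
esxcli nmp roundrobin getconfig --device naa.xxxxxxxx
esxcli nmp roundrobin setconfig --device naa.xxxxxxxx --type "iops" --iops 1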

vanak
Contributor

I tried all the advice here and nothing works... Multipathing and load balancing are really OK, but I can't get more than 1 Gb of throughput...

Markokobal raises an important point here about ESX iSCSI bandwidth; can someone give us (at least markokobal and me) a clear answer on this?

Also, I want to thank all the people here who tried to solve this "issue".

admin
Immortal

In order to get more than 1 Gb/sec using 2 physical ports on ESX, you cannot use NIC teaming/aggregation; instead, you must create a dedicated VMkernel NIC for each physical NIC in your system, then manually bind these 2 VMkernel NICs to the iSCSI HBA.

Instructions on how to do this can be found here: http://tinyurl.com/23w7ub7

Once this is done, ESX should see 4 paths to your storage per your diagram in the OP. This is because ESX will have 2 iSCSI ports and the array is exposing 2 data ports, and these get multiplexed so that we see a path from each ESX port to each array port (2x2).

If, after following those instructions, you are seeing only 2 paths instead of 4, it could mean that there is a connectivity issue between the ESX host and the array. To diagnose that further, we would need to get vm-support logs.
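
A short sketch of how the result can be verified from the service console (vmhba33 and the naa ID are placeholders):

# confirm both VMkernel NICs are bound to the software iSCSI adapter
esxcli swiscsi nic list -d vmhba33

# list the paths for the LUN; with 2 bound vmk ports and 2 array ports
# there should be 4 of them
esxcli nmp path list --device naa.xxxxxxxx

# if paths are missing, generate a support bundle for diagnosis
vm-support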
